A few days ago I talked about Human Computation and the ESP Game. I haven't stopped thinking about it since...
What I was thinking was - why not leverage the principles of the ESP Game to replace CAPTCHA, and transform this "annoying, unproductive, necessary" thing (CAPTCHA) into something that is "annoying, productive, necessary"?
Let me explain:
Imagine you are creating an account in GMail. At some point, instead of being given a CAPTCHA, you are shown an image of a little boy dressed up like a cowboy with a huge mustache. And instead of having to type those annoying skewed letters, you must type a word that describes what you see in the image (maybe with a "taboo" list that says that the word "BOY" cannot be used). Since "BOY" is already taken, you type "MUSTACHE" and your sign-up goes on.
What happened in the background is that the system already had a set of potential tags for this image, that is, tags given by one user that still require acknowledgement by another (independent) user. Since the word MUSTACHE was one of these potential tags, the system can deduce that there is indeed a person behind the computer and lets you continue.
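To make the check concrete, here is a minimal sketch in Python. The function names, the upper-case normalization and the sample tag sets are my own assumptions for illustration; a real system would probably also want to handle plurals, synonyms and misspellings.

```python
def normalize(word: str) -> str:
    """Assumed normalization: trim whitespace and compare in upper case."""
    return word.strip().upper()


def check_answer(answer: str, potential_tags: set, taboo: set) -> bool:
    """Pass only if the word is a potential tag and is not on the taboo list.

    Both sets are assumed to hold already-normalized (upper-case) words.
    """
    word = normalize(answer)
    if not word or word in taboo:
        return False
    return word in potential_tags


# Hypothetical data for the cowboy image from the example above.
potential_tags = {"MUSTACHE", "COWBOY", "HAT"}
taboo = {"BOY"}
```

With this data, "mustache" would be accepted, while "boy" (taboo) and "dog" (not a potential tag) would be rejected.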
How do you get this initial list? Why do you need it anyway? What happens if the word entered does not appear in the potential tags?
You need an initial list for two reasons:
- Without such a list you would have to play the real ESP Game - in which case you need to compare the words entered by two users. This could lead to a slow response if one of the "players" has entered his word and the second one is still thinking. You could solve this by having multiple players with the same image, but then it becomes very similar to my idea of an initial list.
- You must make sure spammers won't take advantage of your system. With the ESP Game, they could launch thousands of simultaneous requests, all entering the exact same word - they would stand a good chance of being paired together and provide you with really bad tags (not to mention that the whole point of blocking spam would be lost).
To generate this initial list, you actually need to show the user 2 images. One of them is based on an existing initial list (i.e. an image for which you already have a large number of potential tags). The second image has no potential tags at all - it's there to build an initial list for the future. The user will "pass" based only on what he entered for the first image - the second one will be judged in the future, once its own list is used for verification.
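The two-image flow could be sketched roughly like this. The `Image` class and its field names are hypothetical, and recording the second answer only when the first one passes is my own assumption (the idea works either way, since collected words are filtered before they are ever used).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Image:
    name: str
    # Tags already usable for verification (the "initial list").
    potential_tags: set = field(default_factory=set)
    # Answers gathered now, to become an initial list in the future.
    collected: Counter = field(default_factory=Counter)


def submit(known: Image, fresh: Image, answer_known: str, answer_fresh: str) -> bool:
    """Judge the user on the known image only; store the other answer for later."""
    passed = answer_known.strip().upper() in known.potential_tags
    if passed:
        word = answer_fresh.strip().upper()
        if word:
            fresh.collected[word] += 1
    return passed
```

Note that `fresh.collected` never influences the result of the current request, so the user never has to wait for anybody else.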
With this approach (2 images, one compared to an existing list and the other to build a future list), spammers have a much harder time fooling you. If you leave enough time between the moment you start collecting an initial list and the moment you use the image for actual verification, you can track words that recur very often across many different images, which could potentially be due to spammers. These words are removed and never used as potential tags. Also, it avoids any delay, since no user depends on other users.
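One possible way to spot such recurring words is to count on how many distinct images each word shows up, and drop anything that appears on too large a share of them. This is only a sketch: the function name, the input shape and the `max_share` threshold of 0.5 are assumptions I picked for illustration, and a real threshold would have to be tuned (very generic words like "MAN" could otherwise be thrown away too).

```python
from collections import Counter


def filter_suspicious(collected: dict, max_share: float = 0.5):
    """collected maps an image id to the set of words collected for it.

    Returns (cleaned, suspicious): the same mapping without the suspicious
    words, and the set of words that appeared on more than max_share of
    all images.
    """
    image_count = len(collected)
    # Count each word once per image, no matter how often it was entered.
    spread = Counter(word for words in collected.values() for word in set(words))
    suspicious = {w for w, n in spread.items() if n / image_count > max_share}
    cleaned = {img: set(words) - suspicious for img, words in collected.items()}
    return cleaned, suspicious
```

Only the words left in `cleaned` would then be promoted to potential tags.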
Even if you collect hundreds of potential tags for each image, you could still get into a situation where the user enters a word that does not exist in the potential tags list. In this case, you could show him another image. Yet, if you want to avoid annoying the users too much, you can simply show him a CAPTCHA. The result is that each user is given either 2 or 3 images (in the latter case the third one is a CAPTCHA), so there is a concrete limit to the level of annoyance. Of course, you can be really nice and offer the user a choice between a CAPTCHA and an image to tag.
Finally, this system could provide you with a huge number of image tags very fast. The advantage is that the taggers are very diverse, and the number of taggers is much larger than you would have in the case of a game.
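The bounded fallback can be written down in a few lines. Everything here is hypothetical naming: `image_answer_ok` stands for the result of checking the first image, and `captcha_ok` is a callback that shows a classic CAPTCHA and reports whether it was solved.

```python
from typing import Callable, Tuple


def verify_user(image_answer_ok: bool, captcha_ok: Callable[[], bool]) -> Tuple[bool, int]:
    """Return (passed, number_of_challenges_shown)."""
    shown = 2  # the known image plus the list-building image
    if image_answer_ok:
        return True, shown
    shown += 1  # a single CAPTCHA fallback, never an endless series of images
    return captcha_ok(), shown
```

The CAPTCHA is only ever requested when the image answer fails, so a user sees at most 3 challenges.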