Yesterday I discussed how the World Wide Web morphed into Web 2.0 by allowing Web users to collaborate to enhance the Web experience [Web 2.0 and Pink Goo]. An article in the Washington Post describes how this feature of Web 2.0 is now helping the blind have a better surfing experience [“Image Labeling for Blind Helps Machines ‘Think’,” by Zachary A. Goldfarb, 21 November 2006]. Goldfarb begins his article by describing the problem:
As director of Web operations at the American Foundation for the Blind, Crista Earl knows more than most about how visually impaired people can access the Internet. Still, when she browses the Web, Earl, 48 and blind, finds it time-consuming and difficult to use. … Earl’s problem is that the program she uses to make the Internet accessible — a screen reader that speaks Web pages aloud — cannot describe pictures and images, an essential part of Web sites. Computers are not yet able to look at an image and know what it is. For the blind, the only solution is for each image to be labeled with an accurate description for the screen reader to say aloud. But few Web site designers do that.
Goldfarb notes that drafting sighted surfers to label images for the blind could tap the Web's collective effort in much the same way that Del.icio.us and Wikipedia have. The challenge, he notes, is that asking people to tag every image that appears on their screen would soon breed boredom or resentment. One answer to this challenge has come in the form of a game.
To make it less tedious and more fun, Luis von Ahn, a computer science professor at Carnegie Mellon University, has created the ESP Game. Two random visitors to ESPGame.org are matched up and shown a random image, which they are asked to label. They cannot communicate. When both provide the same label, they win points. At the same time, computers are associating words with images, a valuable service for the blind. Von Ahn has found that the game is addictive — hundreds of thousands have played, with some spending more than 40 hours per week on the site — and goes a long way toward giving precise descriptions to images. His work is part of an emerging field of computer science called human computation, because the computer is posing the problem, and it is up to people to solve it. Google Inc. recently built a version of the ESP Game on its site, and this year von Ahn won the MacArthur Prize — known as the “genius prize.”
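The matching mechanic Goldfarb describes, in which two isolated players earn points only when their guesses coincide, can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not von Ahn's implementation; the function name and the case-folding rule are illustrative.

```python
from itertools import zip_longest

def first_agreement(guesses_a, guesses_b):
    """Interleave the two players' guess streams in the order entered and
    return the first label both have typed, or None if they never match.
    Labels are lower-cased so 'Dog' and 'dog' count as an agreement."""
    seen_a, seen_b = set(), set()
    for a, b in zip_longest(guesses_a, guesses_b):
        if a is not None:
            seen_a.add(a.lower())
            if a.lower() in seen_b:          # player A just matched B
                return a.lower()
        if b is not None:
            seen_b.add(b.lower())
            if b.lower() in seen_a:          # player B just matched A
                return b.lower()
    return None
```

When the players converge on a word (say, both eventually type "dog"), that word becomes the image's label and both score; when they never converge, the round yields nothing, which is exactly why a matched label carries some confidence.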
This approach takes a novel path toward cognition. As the computer “learns” how humans label certain images, it begins to “understand” what those images are or represent. That learning can then be applied to other images found on the Web. Once a computer can label images itself, the tedium of tagging disappears and the blind get a richer surfing experience.
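One plausible way to turn those human agreements into descriptions a screen reader could speak is simple vote counting: keep only labels confirmed by multiple independent pairs of players. This is a hedged sketch of the idea, not a detail from the article; the vote threshold, function names, and comma-joined output format are my own assumptions.

```python
from collections import Counter, defaultdict

def build_alt_text(agreements, min_votes=2, top_k=3):
    """Given (image_id, agreed_label) pairs collected from many game
    rounds, keep labels confirmed by at least `min_votes` independent
    pairs and join the most frequent ones into a short description."""
    votes = defaultdict(Counter)
    for image_id, label in agreements:
        votes[image_id][label] += 1
    alt = {}
    for image_id, counts in votes.items():
        strong = [lab for lab, n in counts.most_common(top_k) if n >= min_votes]
        if strong:
            alt[image_id] = ", ".join(strong)   # text a screen reader could speak
    return alt
```

Requiring independent confirmation is what filters out one pair's idiosyncratic guess; a label that two or more unacquainted pairs converged on is far more likely to describe what is actually in the picture.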
Peter Norvig, director of research at Google, says the image project is an extension of its main product — the search engine — which organizes search results by analyzing the content and links that people put up on the Web. “Most of what we try to do at Google is build automated solutions,” he says. “We use the human input and then write programs to harness that input.” The promise of von Ahn’s research is that it will allow computers to replicate the complex abilities of the brain. “What he’s doing is mining the ability of humans,” says Manuel Blum, a Carnegie Mellon professor who advised von Ahn’s dissertation. Von Ahn says he has one goal: “To be able to use all of this data and to have computers be able to do pretty much everything we can do.” The end result is a kind of artificial intelligence that would drive a computer to think and act like a human — the kind only seen in science fiction movies.
It is my belief that when business logic, cognition, and technology can be fully blended, we will achieve a Structural Singularity that will provide the basis for an entirely new kind of IT architecture. What kind? Who knows? That is what makes it a singularity. Intelligent computers will help IT architects design these frameworks, and the results should be spectacular. Von Ahn is trying to tap the power of Web 2.0 to help computers become smarter faster.
Von Ahn envisions computers in the future translating foreign text while respecting the nuances of language or summarizing lengthy documents effectively. And he sees computers making fast diagnoses of ailments in hospitals. But these are a ways off. The problem is that computers often do not have enough examples to come up with reasonable judgments. From the moment we are born, von Ahn says, we begin to store countless images, sounds, smells and other perceptions from daily experiences — and immediately associate words with them. Over time, we develop a seamless ability to describe things. We call it common sense. Show a 6-year-old a picture of a boy walking a dog, and the child will instantly be able to describe it. Show the same picture to a computer, and it would not be able to describe what is happening. “Nobody bothers to teach a computer,” von Ahn says. With the ESP Game and other projects, he is trying to devise ways for humans to provide enough experiences to computers so that they can come up with common-sense judgments or descriptions.
Von Ahn believes, for example, that when computers become sufficiently adept at image interpretation they could be used in airport security to screen luggage, dramatically speeding up the screening process and making it more reliable. Watching how people tap the power of Web 2.0 will continue to be fascinating, and it should also help speed the arrival of Web 3.0.