Humans enjoy stereoscopic vision. As we mentioned in our essay last issue, because our eyes are separated horizontally, the images we see in the two eyes are slightly different, and the difference is proportional to the relative depth. The visual areas in the brain measure these differences, and we experience the result as stereo, the effect we all enjoyed as children playing with View-Master toys.
Visual-image processing from the eye to the brain happens in stages. Rudimentary features such as the orientation of edges, direction of motion, color, and so on are extracted early on in areas called V1 and V2 before reaching the next stages in the visual-processing hierarchy for a progressively more refined analysis. This stage-by-stage description is a caricature; many pathways go “back” from stage to stage—allowing the brain to play a kind of 20-questions game to arrive at a solution after successive iterations.
Returning to the concept of stereo, we can ask: At what stage is the comparison of the two eyes’ images made? If you are looking at a scene with hundreds of features, how do you know which feature in one eye matches with which feature in the other eye? How do you avoid false matches? Until the correct matching is achieved, you cannot measure differences. In stereopsis, this conundrum is called the correspondence problem.
Questions about Boundaries
To address this issue, the great 19th-century German physicist, ophthalmologist and physiologist Hermann von Helmholtz asked: Is the comparison done very early, before object boundaries are recognized, or does the brain first separately extract contours in each eye before comparing them? He concluded, without a great deal of evidence, that form perception of outlines in each eye occurs prior to interocular comparison. “Monocular form perception precedes stereopsis,” he said, arguing that the task of comparing the images in the two eyes is horrendously complex and happens very high up in the visual-processing hierarchy. The brain solves the correspondence problem by initially recognizing forms and then comparing the extended outlines of the forms. This strategy allows the brain to avoid (or minimize) false matches.
This idea was challenged nearly 100 years later by the late Hungarian scientist Béla Julesz, a non-self-effacing man of unparalleled genius, while working at Bell Labs. He employed a different stereogram, using computer-generated random-dot patterns rather than photographs or line drawings. In neither the left-eye nor the right-eye image is there any recognizable contour or form at all.
Although these stereograms are made using a computer, the principle can be understood by using a digital camera and random-dot images. Begin with a random-dot pattern about five square centimeters in size. Use a pair of scissors to cut out a one-by-one-centimeter square patch from another random-dot pattern (call it S, for square). Center this square atop the first pattern and take a photo to produce the left eye’s image (L). If S is correctly positioned, it becomes virtually invisible because of camouflage from background dots. Now, slightly shift S horizontally to the right (making sure to position it so that no boundary of overlapping dots is seen from the small square). Take another picture to make the right eye’s image, R.
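For readers who would rather use a computer than scissors, the paper-and-camera recipe above can be sketched in a few lines of Python with NumPy. This is only an illustration under our own assumptions, not a reconstruction of Julesz's original program: the function name make_random_dot_stereogram and the parameters size, patch, shift and seed are hypothetical, and sizes are measured in pixels rather than centimeters.

```python
import numpy as np

def make_random_dot_stereogram(size=100, patch=20, shift=4, seed=0):
    """Return (left, right) random-dot images that differ only by a shifted square.

    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Background: every pixel is randomly black (0) or white (1).
    background = rng.integers(0, 2, size=(size, size), dtype=np.uint8)
    # The patch S, cut from a second, independent random-dot pattern.
    square = rng.integers(0, 2, size=(patch, patch), dtype=np.uint8)

    top = (size - patch) // 2    # row where the centered square starts
    col = (size - patch) // 2    # column where the centered square starts

    # Left eye's image: S centered on the background.
    left = background.copy()
    left[top:top + patch, col:col + patch] = square

    # Right eye's image: the same S moved `shift` pixels to the right.
    # The strip S uncovers shows the original background dots again.
    right = background.copy()
    right[top:top + patch, col + shift:col + shift + patch] = square
    return left, right

left, right = make_random_dot_stereogram()
```

Displayed on its own, either array looks like uniform visual noise; the square exists only in the relation between the two.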
Julesz presented just one image from his random-dot stereogram to each eye and was astonished to see a small square float out so vividly that he was almost tempted to grab it, even though no square is visible in either eye. The original experiment was done with digitally generated pixels rather than bits of paper, and the shift was an exact whole number of pixels. So it is not as if there is a square hidden in each eye’s image; mathematically, it does not even exist in either eye alone. It is defined exclusively by the difference: the horizontal shift of S. Julesz concluded that von Helmholtz was wrong. Because the square emerges only as a result of stereoscopic fusion, stereo matching must be a point-to-point (or pixel-to-pixel) measurement of displacement, and the outline of the square emerges solely from this comparison. Stereo precedes detection of form (“form” being used interchangeably with extended outlines and boundaries in this context).
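The idea that an outline can emerge purely from a point-to-point comparison can be caricatured in code. The sketch below is our own simplified illustration, not a model of the brain and not Julesz's method: it builds a small random-dot pair, then, for every pixel, tries a handful of horizontal shifts and keeps the one for which the dots in a small neighborhood agree best between the two images. The helper names make_pair and estimate_disparity, and the parameters max_shift and window, are hypothetical. Averaging over a neighborhood is an assumption we add because a single black-or-white dot matches by chance half the time, which is the false-match problem in miniature.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # needs NumPy 1.20 or later

def make_pair(size=100, patch=20, shift=4, seed=0):
    """Hypothetical helper: random-dot pair whose central square is shifted in `right`."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, size=(size, size))
    right = left.copy()
    start = (size - patch) // 2
    # Fresh random dots for the square S: centered in left, moved `shift` columns in right.
    square = rng.integers(0, 2, size=(patch, patch))
    left[start:start + patch, start:start + patch] = square
    right[start:start + patch, start + shift:start + shift + patch] = square
    return left, right

def estimate_disparity(left, right, max_shift=8, window=7):
    """For each pixel, pick the horizontal shift with the best local dot agreement."""
    pad = window // 2
    scores = []
    for d in range(max_shift + 1):
        # Point-to-point comparison: left pixel (r, c) against right pixel (r, c + d).
        # np.roll wraps around at the image edge, which is harmless for this toy case.
        agree = (left == np.roll(right, -d, axis=1)).astype(float)
        padded = np.pad(agree, pad, mode="edge")
        # Average agreement over a window x window neighborhood to suppress chance matches.
        scores.append(sliding_window_view(padded, (window, window)).mean(axis=(2, 3)))
    # Index of the best-scoring shift at every pixel.
    return np.argmax(np.stack(scores), axis=0)

left, right = make_pair()
disparity = estimate_disparity(left, right)
print(disparity[50, 50], disparity[10, 10])  # one pixel inside the square, one far outside
```

With these settings the disparity map should equal the shift well inside the square and zero in the distant surround, so the square's outline appears in the map even though neither input image contains one.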