THE BRAIN abhors ambiguity, yet we are curiously attracted to it. Many famous visual illusions exploit ambiguity to titillate the senses. Resolving uncertainties creates a pleasant jolt in your brain, similar to the one you experience in the “Eureka!” moment of solving a problem. Such observations led German physicist, psychologist and ophthalmologist Hermann von Helmholtz to point out that perception has a good deal in common with intellectual problem solving. More recently, the idea has been revived and championed eloquently by neuropsychologist Richard L. Gregory of the University of Bristol in England.
So-called bistable figures, such as the mother-in-law/wife (a) and faces/vase (b) illusions, are often touted in textbooks as prime examples of how top-down influences (preexisting knowledge or expectations) from higher brain centers, where such perceptual tokens as “old” and “young” are encoded, can influence perception. Laypeople often take this to mean you can see anything you want to see, but this is nonsense (although, ironically, this view contains more truth than most of our colleagues would allow).
Consider the simple case of the Necker cube (c and variation in d). You can view this illusion in one of two ways, as a cube pointing either up or down, and at any given time you see only one or the other. With a little practice, you can flip between these alternate percepts at will (still, it is great fun when it flips spontaneously; it feels like an amusing practical joke has been played on you). In fact, the drawing is compatible not only with two interpretations, as is commonly believed; there is actually an infinite set of trapezoidal shapes that can produce exactly the same retinal image, yet the brain homes in on a cube without hesitation. The visual system appears to struggle to determine which of two cubes the drawing represents, but it has already solved the much larger perceptual problem by rejecting the countless other configurations that could give rise to the retinal pattern we call the Necker cube. Top-down attention and will, or intent, can only help you select between two percepts; you will not see any of the other possibilities no matter how hard you try.
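The claim that many different shapes yield the same retinal image can be checked with a little geometry. The sketch below is only an illustration under stated assumptions: it treats the eye as an idealized pinhole at the origin, and the function names (`project`, `cube_vertices`, `slide_along_rays`, `same_image`) are hypothetical. It slides each corner of a cube by a different random amount along its own line of sight and confirms that the projected image does not change.

```python
import math
import random

def project(point, focal_length=1.0):
    """Pinhole projection of a 3-D point onto an image plane (an assumed eye model)."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def cube_vertices(half=1.0, center_depth=5.0):
    """Eight corners of a cube centered on the line of sight, in front of the eye."""
    return [(sx * half, sy * half, center_depth + sz * half)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]

def slide_along_rays(vertices, rng):
    """Move every corner by its own random factor along the ray from the eye."""
    moved = []
    for x, y, z in vertices:
        k = rng.uniform(0.5, 2.0)  # any positive factor keeps the corner on its ray
        moved.append((k * x, k * y, k * z))
    return moved

def same_image(shape_a, shape_b, tol=1e-9):
    """True if both shapes project to the same image points (within rounding)."""
    return all(
        math.isclose(ua, ub, abs_tol=tol) and math.isclose(va, vb, abs_tol=tol)
        for (ua, va), (ub, vb) in zip(map(project, shape_a), map(project, shape_b))
    )

cube = cube_vertices()
distorted = slide_along_rays(cube, random.Random(0))
print(same_image(cube, distorted))
```

Because the scaling factor can be chosen freely for each corner, the set of solid shapes consistent with one image is unbounded. The distorted shape here is no longer a cube, yet it is indistinguishable from one in the projection.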
Although the Necker cube is often used to illustrate the role of top-down influences, it, in fact, proves the very opposite—namely, that perception is generally immune to such influences. Indeed, if all perceptual computations mainly relied on top-down effects, they would be much too slow to help you in tasks related to survival and the propagation of your genes—escaping a predator, for example, or catching a meal or a mate.
It is important to recognize that ambiguity does not arise only in cleverly contrived displays such as on these two pages and in e, in which shading could make the circles appear to be convex or concave. In truth, ambiguity is the rule rather than the exception in perception; it is usually resolved by other coexisting bottom-up (or sideways, if that is the right word) cues that exploit built-in statistical “knowledge” of the visual world. Such knowledge is wired into the neural circuitry of the visual system and deployed unconsciously to eliminate millions of false solutions. But the knowledge in question pertains to general properties of the world, not specific ones. The visual system has hardwired knowledge of surfaces, contours, depth, motion, illumination, and so on but not of umbrellas, chairs or dalmatians.
Ambiguity also arises in motion perception. In f, we begin with two light spots flashed simultaneously on diagonally opposite corners of an imaginary square, shown at 1. The lights are then switched off and replaced by spots appearing on the remaining two corners, at 2. The two frames are then cycled continuously. In this display, which we call a bistable quartet, the spots can be seen as oscillating vertically (dashed arrows) or horizontally (solid arrows) but never as both simultaneously—another example of ambiguity. It takes greater effort, but as with the cube, you can intentionally flip between these alternate percepts.
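The geometry of the display can be written down in a few lines. This is a minimal sketch, not the authors' stimulus code: the coordinates and the names `quartet_frames`, `candidate_matches` and `path_length` are assumptions made for illustration. It builds the two frames and lists the two ways the spots in frame 1 could be paired with the spots in frame 2.

```python
import math

def quartet_frames(side=1.0):
    """Return the two frames; each frame lights one diagonal of the square."""
    top_left, top_right = (0.0, side), (side, side)
    bottom_left, bottom_right = (0.0, 0.0), (side, 0.0)
    frame_1 = (top_left, bottom_right)   # spots shown at 1
    frame_2 = (top_right, bottom_left)   # spots shown at 2
    return frame_1, frame_2

def candidate_matches(frame_1, frame_2):
    """The two possible pairings of frame-1 spots with frame-2 spots."""
    a, b = frame_1
    c, d = frame_2
    return {
        "horizontal": [(a, c), (b, d)],  # each spot moves sideways
        "vertical": [(a, d), (b, c)],    # each spot moves up or down
    }

def path_length(pairs):
    """Total distance traveled by the spots under one pairing."""
    return sum(math.dist(start, end) for start, end in pairs)

frame_1, frame_2 = quartet_frames()
matches = candidate_matches(frame_1, frame_2)
for name, pairs in matches.items():
    print(name, path_length(pairs))
```

On a square, both pairings move each spot the same distance, so nothing in the geometry of this sketch favors one reading over the other.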
We asked ourselves what would happen if you scattered several such bistable-quartet stimuli on a computer screen. Would they all flip together when you mentally flipped one? Or, given that any one of them has a 50 percent chance of being vertical or horizontal, would each flip separately? That is, is the resolution of ambiguity global (all the quartets look the same), or does it occur piecemeal for different parts of the visual field?
The answer is clear: they all flip together. There must be global fieldlike effects in the resolution of ambiguity. You might want to try experimenting with this on your computer. You could also ask, Does the same rule apply for the mother-in-law/wife illusion? How about the Necker cube? It is remarkable how much you can learn about perception using such simple displays; it is what makes the field so seductive.
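For readers who want to try this, the following sketch covers only the layout of such a display, under assumptions of our own choosing: the screen size, spot spacing and the names `scatter_quartets` and `frame_spots` are hypothetical. Drawing and timing are left to whatever graphics toolkit you prefer. The code scatters several quartets without overlap and returns the spots to draw on each alternating frame.

```python
import random

def scatter_quartets(count, width=800, height=600, side=60, gap=20, seed=0):
    """Pick lower-left corners for `count` quartets that do not overlap."""
    rng = random.Random(seed)
    origins = []
    attempts = 0
    while len(origins) < count and attempts < 10_000:
        attempts += 1
        x = rng.uniform(0, width - side)
        y = rng.uniform(0, height - side)
        # Accept only positions clear of every quartet placed so far.
        if all(abs(x - ox) >= side + gap or abs(y - oy) >= side + gap
               for ox, oy in origins):
            origins.append((x, y))
    return origins

def frame_spots(origins, frame_index, side=60):
    """Spots to draw on one frame: even frames light one diagonal, odd the other."""
    spots = []
    for x, y in origins:
        if frame_index % 2 == 0:
            spots += [(x, y), (x + side, y + side)]
        else:
            spots += [(x + side, y), (x, y + side)]
    return spots

origins = scatter_quartets(6)
first = frame_spots(origins, 0)
second = frame_spots(origins, 1)
```

Cycling `frame_spots(origins, 0)` and `frame_spots(origins, 1)` on the screen reproduces the alternation described above for every quartet at once.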
We must be careful not to say that top-down influences play no role at all. In some of the figures, you can get stuck in one interpretation but can switch once you are told that there is an alternative interpretation. It is as if your visual system, tapping into high-level memory, “projects” a template (for example, an old or young face) onto the fragments to facilitate their perception. One could argue that the recognition of objects can benefit from top-down processes that tap into attentional selection and memory. In contrast, seeing contours, surfaces, motion and depth proceeds mainly from the bottom up (you can “see” all the surfaces and corners of a cube and even reach out and grab it physically and yet not know or recognize it as a cube). In fact, we have both had the experience of peering at neurons all day through a microscope and then the next day “hallucinating” neurons everywhere: in trees, leaves and clouds. The extreme example of this effect is seen in patients who become completely blind and start hallucinating elves, circus animals and other objects, a condition called Charles Bonnet syndrome. In these individuals, only top-down inputs contribute to perception; the bottom-up processes, missing because the patients are blind (from macular degeneration or cataracts), can no longer limit their hallucinations. It is almost as though we are all hallucinating all the time and what we call object perception merely involves selecting the one hallucination that best matches the current sensory input, however fragmentary. Vision, in short, is controlled hallucination.
But doesn't this statement contradict what we said earlier about vision being largely bottom-up? The answer to this riddle is that “vision” is not a single process: perception of objectness (an object's outline, surface depth, and so on, as when you see a cube as cuboid) is largely bottom-up, whereas higher-level identification and categorization of objects as neurons or umbrellas do indeed benefit enormously from top-down memory-based influences.
How and What
Physiology also supports this distinction. Signals from the eyeballs are initially processed in the primary visual cortex at the back of the brain and then diverge into two visual pathways: the “how” pathway in the parietal lobe of the brain and the “what” pathway, linked to memories, in the temporal lobes. The former is concerned with spatial vision and navigation—reaching out to grab something, avoiding obstacles and pits, dodging missiles, and so on, none of which requires that you identify the object in question. The temporal lobes, on the other hand, enable you to recognize what an object actually is (pig, woman, table), and this process probably benefits partially from memory-based top-down effects. There are hybrid cases in which they overlap. For example, with the faces/vase illusion there is a bias to get stuck seeing the faces. But you can switch to seeing the vase without explicitly being told “look for the vase,” if you are instead instructed to attend to the white region and see it as a foreground figure rather than as background.
Can the perception of ambiguous, bistable figures be biased if they are preceded by other, nonambiguous figures, a technique called priming? Priming has been explored extensively in linguistics (for instance, reading “foot” preceded by “leg” evokes the body part, but reading “foot” preceded by “inches” might suggest a ruler). Intriguingly, such priming can occur even if the first word appears too briefly to be seen consciously. Whether perception can be similarly primed has not been carefully studied. You might try it on friends.
Finally, as we noted in one of our previous columns, you can construct displays that are always ambiguous, such as the devil's pitchfork or the perpetual staircase [see “Paradoxical Perceptions,” April/May 2007]. Such paradoxical figures evoke wonder, delight and frustration at the same time—a microcosm of life itself.