Unless you have been deaf and blind to the world over the past decade, you know that functional magnetic resonance brain imaging (fMRI) can look inside the skull of volunteers lying still inside the claustrophobic, coffinlike confines of a loud, banging magnetic scanner. The technique relies on a fortuitous property of the blood supply to reveal regional activity. Active synapses and neurons consume power and therefore need more oxygen, which is delivered by the hemoglobin molecules inside the circulating red blood cells. When these molecules give off their oxygen to the surrounding tissue, they not only change color—from arterial red to venous blue—but also turn slightly magnetic.
Activity in neural tissue causes an increase in the volume and flow of fresh blood. This change in the blood supply, called the hemodynamic signal, is tracked by sending radio waves into the skull and carefully listening to their return echoes. FMRI does not directly measure synaptic and neuronal activity, which occurs over the course of milliseconds; instead it uses a relatively sluggish proxy—changes in the blood supply—that rises and falls in seconds. The spatial resolution of fMRI is currently limited to a volume element (voxel) the size of a pea, encompassing about one million nerve cells.
Neuroscientists routinely exploit fMRI to infer what volunteers are seeing, imagining or intending to do. It is really a primitive form of mind reading. Now a team has taken that reading to a new, startling level.
A number of groups have deduced the identity of pictures viewed by volunteers lying in the magnetic scanner from the slew of maplike representations found in primary, secondary and higher-order visual cortical regions underneath the bump on the back of the head.
Jack L. Gallant of the University of California, Berkeley, is the acknowledged master of these techniques, which proceed in two stages. First, a volunteer looks at a couple of thousand images while lying in a magnet. The response of a few hundred voxels in the visual cortex to each image is carefully registered. These data are then used to train an algorithm to predict the magnitude of the fMRI response for each voxel. Second, this procedure is inverted. That is, for a given magnitude of hemodynamic response, a probabilistic technique called Bayesian decoding infers the most likely image that gave rise to the observed response in that particular volunteer (human brains differ substantially, so it is difficult to use one brain to predict the responses of another).
The best of these techniques exploit preexisting, or prior, knowledge about the kinds of pictures a volunteer is likely to have seen. The number of mathematically possible images is vast, but the types of actual scenes that are encountered in a world populated by people, animals, trees, buildings and other objects encompass a tiny fraction of all possible images. Appropriately enough, the images that we usually encounter are called natural images. Using a database of six million natural images, Gallant’s group showed in 2009 how photographs that volunteers had not previously encountered could be reconstructed from their brain responses.
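For readers who like to see the logic spelled out, the two stages can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Gallant group's actual method: the image "features" are random numbers standing in for real image descriptors, voxel responses are simulated as a linear function of those features plus Gaussian noise, the encoding model is a plain ridge regression, and names such as `simulate_responses` and `decode` are hypothetical. The prior is flat here; a natural-image prior would simply assign higher values to more plausible candidates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_features, n_voxels = 2000, 20, 300

# ASSUMPTION: a hidden linear mapping from image features to voxel responses.
true_weights = rng.normal(size=(n_features, n_voxels))

def simulate_responses(features, noise_sd=1.0):
    """Toy scanner: responses are linear in the features plus Gaussian noise."""
    noise = rng.normal(scale=noise_sd, size=(features.shape[0], n_voxels))
    return features @ true_weights + noise

# Stage 1: learn an encoding model (one weight vector per voxel)
# from a couple of thousand training images.
train_features = rng.normal(size=(n_train, n_features))
train_responses = simulate_responses(train_features)
ridge = 1.0  # assumed regularization strength
gram = train_features.T @ train_features + ridge * np.eye(n_features)
weights = np.linalg.solve(gram, train_features.T @ train_responses)

# Stage 2: invert the model. Score every candidate image by
# log posterior = log likelihood + log prior and keep the best one.
def decode(observed, candidate_features, log_prior, noise_sd=1.0):
    predicted = candidate_features @ weights
    squared_error = np.sum((observed - predicted) ** 2, axis=1)
    log_likelihood = -0.5 * squared_error / noise_sd**2
    return int(np.argmax(log_likelihood + log_prior))

# A small stand-in for the database of natural images.
n_candidates = 500
candidate_features = rng.normal(size=(n_candidates, n_features))
log_prior = np.full(n_candidates, -np.log(n_candidates))  # flat prior

true_index = 123
observed = simulate_responses(candidate_features[true_index:true_index + 1])[0]
decoded_index = decode(observed, candidate_features, log_prior)
print(decoded_index == true_index)
```

In this toy setting the decoder merely picks the best match from a fixed set of candidates. Reconstructing a picture the volunteer has never seen is harder, and that is where the prior over natural images earns its keep.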
From Images to Movies
These reconstructions are surprisingly good, even though they are based on the smudged activity of hundreds of thousands of highly diverse nerve cells, each one firing to different aspects of the image—its local intensity, color, shading, texture, and so on. A further limitation I have already alluded to is the 1,000-fold mismatch between the celerity of neuronal signals and the sedate pace at which the fMRI signal rises and falls.
Yet Gallant’s group fearlessly pushed on and applied Bayesian reconstruction techniques to the conceptually and computationally much more demanding problem of spatiotemporal reconstruction.
Three members of the group each watched about two hours’ worth of short takes from various Hollywood movies. These data were used to train a separate encoding model for each voxel. The first part of the model consisted of a bank of neural filters. These filters are based on the cumulative research that has been conducted over two decades into the way nerve cells in the visual cortex of people and monkeys respond to visual stimuli of varying position, size, motion and speed. The second part of the model coupled these neuronal filters to the blood vasculature, describing how the neuronal activity is reflected in much slower fMRI signals.
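The two-part structure of such a model can be sketched as follows. This is a minimal illustration under assumptions, not the group's published model: the filter outputs are made-up numbers rather than real motion-energy filters, the blood response is approximated by a gamma-shaped curve that peaks five seconds after a burst of neural activity, and the function names `hemodynamic_kernel` and `predict_voxel` are hypothetical.

```python
import numpy as np

def hemodynamic_kernel(dt=1.0, duration=20.0, peak_time=5.0):
    """ASSUMED gamma-shaped blood response, normalized to sum to 1."""
    t = np.arange(0.0, duration, dt)
    shape = 6.0
    scale = peak_time / (shape - 1.0)  # places the curve's peak at peak_time
    kernel = t ** (shape - 1.0) * np.exp(-t / scale)
    return kernel / kernel.sum()

def predict_voxel(filter_outputs, weights, kernel):
    """Part 1: weight and sum the filter outputs into a fast neural drive.
    Part 2: smear that drive in time with the slow blood response."""
    neural_drive = filter_outputs @ weights
    return np.convolve(neural_drive, kernel)[: len(neural_drive)]

# Toy movie: 40 one-second samples, three filters, and a single brief
# burst of activity in the first filter at the 10-second mark.
n_time, burst_time = 40, 10
filter_outputs = np.zeros((n_time, 3))
filter_outputs[burst_time, 0] = 1.0
weights = np.array([2.0, 0.0, -1.0])  # hypothetical per-voxel weights

kernel = hemodynamic_kernel()
bold = predict_voxel(filter_outputs, weights, kernel)
peak_delay = int(np.argmax(bold)) - burst_time
print(peak_delay)  # seconds between the burst and the peak of the predicted signal
```

A burst that lasts an instant at the neural level shows up in the predicted fMRI signal as a slow swell that crests seconds later, which is why the second stage of the model is needed before predictions can be compared with the scanner's measurements.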