When the manuscript crossed his desk, Joshua Plotkin, a theoretical biologist at the University of Pennsylvania, was immediately intrigued. The physicist Freeman Dyson and the computer scientist William Press, both highly accomplished in their fields, had found a new solution to a famous, decades-old game theory scenario called the prisoner’s dilemma, in which players must decide whether to cheat or cooperate with a partner. The prisoner’s dilemma has long been used to help explain how cooperation might endure in nature. After all, natural selection is ruled by the survival of the fittest, so one might expect that selfish strategies benefiting the individual would be most likely to persist. But careful study of the prisoner’s dilemma revealed that organisms could act entirely in their own self-interest and still create a cooperative community.
Press and Dyson’s new solution to the problem, however, threw that rosy perspective into question. It suggested the best strategies were selfish ones that led to extortion, not cooperation.
Plotkin found the duo’s math remarkable in its elegance. But the outcome troubled him. Nature includes numerous examples of cooperative behavior. For example, vampire bats donate some of their blood meal to community members that fail to find prey. Some species of birds and social insects routinely help raise another’s brood. Even bacteria can cooperate, sticking to each other so that some may survive poison. If extortion reigns, what drives these and other acts of selflessness?
Press and Dyson’s paper looked at a classic game theory scenario: a pair of players engaged in repeated confrontation. Plotkin wanted to know whether generosity could be revived if the same math were applied to a situation that more closely resembled nature. So he recast their approach in a population, allowing individuals to play a series of games with every other member of their group. The outcome of his experiments, the most recent of which was published in December in the Proceedings of the National Academy of Sciences, suggests that generosity and selfishness walk a precarious line. In some cases, cooperation triumphs. But shift just one variable, and extortion takes over once again. “We now have a very general explanation for when cooperation is expected, or not expected, to evolve in populations,” said Plotkin, who conducted the research along with his colleague Alexander Stewart.
The work is entirely theoretical at this point. But the findings could potentially have broad-reaching implications, explaining phenomena ranging from cooperation among complex organisms to the evolution of multicellularity—a form of cooperation among individual cells.
Plotkin and others say that Press and Dyson’s work could provide a new framework for studying the evolution of cooperation using game theory, allowing researchers to tease out the parameters that permit cooperation to exist. “It has basically revived this field,” said Martin Nowak, a biologist and mathematician at Harvard University.
Tit for tat
Vervet monkeys are known for their alarm calls. A monkey will scream to warn its neighbors when a predator is nearby. But in doing so, it draws dangerous attention to itself. Scientists going back to Darwin have struggled to explain how this kind of altruistic behavior evolved. If a high enough percentage of screaming monkeys gets picked off by predators, natural selection would be expected to snuff out the screamers in the gene pool. Yet it does not, and speculation as to why has led to decades of (sometimes heated) debate.
Researchers have proposed different possible mechanisms to explain cooperation. Kin selection suggests that helping family members ultimately helps the individual. Group selection proposes that cooperative groups may be more likely to survive than uncooperative ones. And direct reciprocity posits that individuals benefit from helping someone who has helped them in the past.
The prisoner’s dilemma helps researchers understand the simple strategies, such as cooperating with generous community members and cheating the cheaters, that can create a cooperative society under the right conditions. First described in the 1950s, the classic prisoner’s dilemma involves a pair of felons who are arrested and placed in separate rooms. Each is given a choice: confess or stay silent. In the best outcome, both say nothing and go free. But since neither knows what the other will do, keeping quiet is risky. If one snitches and the other stays silent, the rat gets a lighter sentence while the quiet partner suffers.
Even simple organisms, such as microbes, engage in these types of games. Some marine microorganisms produce molecules that help them gather iron, a vital nutrient. Microbial colonies often have both producers and cheaters: microbes that don’t make the compound themselves but exploit their neighbors’ molecules.
In a single instance of the prisoner’s dilemma, the best strategy is to defect—squeal on your partner and you’ll get less time. But if the game repeats over and over, the optimal strategy changes. In a single encounter, a vervet monkey that spots a predator is safer if it stays silent. But over the course of a lifetime, the monkey is more likely to survive if it warns its neighbors of impending danger and they do the same. “Each player has the incentive to defect, but overall they will do better if they cooperate,” Plotkin said. “It’s a classic problem for how cooperation can emerge.”
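The logic of the single encounter can be made concrete with a small sketch. The article gives no numeric payoffs, so the values below (5, 3, 1, 0) are the conventional textbook ones and should be read as an assumption, as should the names `PAYOFF` and `best_reply`.

```python
# Payoff to "me" for each (my move, partner's move); "C" = cooperate (stay
# silent), "D" = defect (confess). The numbers are conventional illustrative
# values, an assumption: the article itself gives none.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def best_reply(partner_move):
    """Return the move that maximizes my one-shot payoff against a fixed partner move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, partner_move)])


for partner_move in "CD":
    print(f"partner plays {partner_move}: my best reply is {best_reply(partner_move)}")
print("both cooperate:", PAYOFF[("C", "C")], "each; both defect:", PAYOFF[("D", "D")], "each")
```

Whatever the partner does, defecting pays more in a single round, yet two defectors each end up worse off than two cooperators. That gap is the dilemma.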
In the 1970s, Robert Axelrod, a political scientist at the University of Michigan, launched a round-robin tournament pitting different strategies against each other. To the surprise of many contenders, the simplest approach won. Simply mimicking the other player’s previous move, a strategy called tit for tat, triumphed over much more sophisticated programs.
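Tit for tat is simple enough to write in a few lines. The sketch below is not Axelrod’s tournament code; it is a minimal illustration that assumes the conventional payoffs (5, 3, 1, 0) and uses hypothetical helper names (`play`, `tit_for_tat`, `always_defect`).

```python
# Illustrative payoffs (an assumption): (payoff to A, payoff to B) per round.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): (R, R), ("C", "D"): (S, T),
          ("D", "C"): (T, S), ("D", "D"): (P, P)}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect no matter what the opponent has done."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the total scores (A, B)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each player sees only the other's past moves
        move_b = strategy_b(history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print("tit for tat vs tit for tat:  ", play(tit_for_tat, tit_for_tat))
print("tit for tat vs always defect:", play(tit_for_tat, always_defect))
```

In this sketch tit for tat never outscores the partner it is facing: it loses narrowly to a pure defector and ties with a copy of itself. Its strength in a round robin comes from the high totals it collects across many pairings.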
Tit-for-tat strategies can be found across the biological world. Pairs of stickleback fish, for example, scout nearby predators in a sort of tit-for-tat duet. If one fish makes the risky move of darting ahead, the other reciprocates with a similar act of bravery. If one hangs back, hoping to let its partner take the risk, the partner also drops back.
Over the last 30 years, scientists have explored more evolutionarily realistic versions of the prisoner’s dilemma than Axelrod’s simple version. Players in a large round-robin tournament start with a varied set of strategies; think of this as their genetically determined fitness. To mimic survival of the fittest, the winner of each interaction begets more offspring, which inherit the same strategy as their parent. The most successful strategies thus grow in popularity over time.
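One common way to model this kind of selection is to let each strategy’s share of the population grow in proportion to its average score. The sketch below is a deliberately simplified, deterministic version of that idea, not the model used in any of the studies described here. The three strategies, the 10-round matches, the payoff values and the function names are all assumptions chosen for illustration.

```python
# Illustrative payoffs to the first player (an assumption).
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

STRATEGIES = {
    "ALLC": lambda opp_history: "C",                                   # always cooperate
    "ALLD": lambda opp_history: "D",                                   # always defect
    "TFT": lambda opp_history: opp_history[-1] if opp_history else "C",  # tit for tat
}


def match_score(name_a, name_b, rounds=10):
    """Total payoff to strategy A over one repeated game against strategy B."""
    history_a, history_b, score = [], [], 0
    for _ in range(rounds):
        move_a = STRATEGIES[name_a](history_b)
        move_b = STRATEGIES[name_b](history_a)
        score += PAYOFF[(move_a, move_b)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score


# Score of every strategy against every other, computed once.
SCORES = {(a, b): match_score(a, b) for a in STRATEGIES for b in STRATEGIES}


def evolve(shares, generations=200):
    """Each generation, a strategy's share grows in proportion to its average score."""
    shares = dict(shares)
    for _ in range(generations):
        fitness = {a: sum(shares[b] * SCORES[(a, b)] for b in shares) for a in shares}
        mean_fitness = sum(shares[a] * fitness[a] for a in shares)
        shares = {a: shares[a] * fitness[a] / mean_fitness for a in shares}
    return shares


final = evolve({"ALLC": 1 / 3, "ALLD": 1 / 3, "TFT": 1 / 3})
for name, share in final.items():
    print(f"{name}: {share:.3f}")
```

In this toy run the defectors gain at first by exploiting the unconditional cooperators, then fade as their victims become scarce and tit for tat takes over most of the population. Different starting mixes or payoffs can change the outcome.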
The winning approach depends on a variety of factors, including the size of the group, which strategies are present at the start, and how often players make mistakes. Indeed, adding noise to the game—a random change in strategy that acts as a stand-in for genetic mutation—ends the reign of tit for tat. Under these circumstances, a variant known as generous tit for tat, which involves occasionally forgiving another’s betrayal, triumphs.
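A small simulation suggests why forgiveness helps once mistakes are possible. It is a sketch under stated assumptions: noise is modeled as a 5 percent chance that a player’s move comes out flipped, the forgiveness probability of 1/3 and the payoff values are illustrative choices, and each strategy simply plays a copy of itself rather than competing in a full tournament.

```python
import random

# Illustrative payoffs to the first player (an assumption).
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def reciprocator_move(opponent_last, forgiveness, noise, rng):
    """Copy the opponent's last move, forgive a defection with probability
    `forgiveness`, then flip the result with probability `noise` (a mistake)."""
    move = opponent_last
    if move == "D" and rng.random() < forgiveness:
        move = "C"
    if rng.random() < noise:
        move = "D" if move == "C" else "C"
    return move


def average_payoff(forgiveness, noise=0.05, rounds=20000, seed=1):
    """Mean per-round payoff when two identical reciprocators play each other."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    last_a = last_b = "C"
    total = 0
    for _ in range(rounds):
        move_a = reciprocator_move(last_b, forgiveness, noise, rng)
        move_b = reciprocator_move(last_a, forgiveness, noise, rng)
        total += PAYOFF[(move_a, move_b)] + PAYOFF[(move_b, move_a)]
        last_a, last_b = move_a, move_b
    return total / (2 * rounds)


tft = average_payoff(forgiveness=0.0)       # plain tit for tat
gtft = average_payoff(forgiveness=1 / 3)    # generous tit for tat
print(f"tit for tat: {tft:.2f}   generous tit for tat: {gtft:.2f}")
```

With forgiveness set to zero, a single slip sends two tit-for-tat players into long runs of mutual retaliation. The generous variant breaks those runs and spends more of its time in mutual cooperation.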
The overall flavor of these simulations is optimistic—kindness pays. “The most successful strategies often tend to be the ones that don’t try to take advantage of another person,” Nowak said.
Enter Press and Dyson with a dark dose of despair.
Press and Dyson outlined an approach, dubbed extortion, in which one player could always win by choosing to defect according to a prescribed set of probabilities. Their strategy is remarkable in that it allows one player to control the outcome of the game. “The main innovation is to calculate how often you can defect without demotivating your co-player completely,” said Christian Hilbe, a researcher in Nowak’s group at Harvard. Moreover, the winning player need only remember one previous move, yet the strategy works just as well as those that incorporate many previous rounds of play.
The second player is forced to cooperate with the extortionist because that’s the option that provides the best payoff. “If I’m an extortionist, once in a while I’ll defect even though we cooperated, in precisely enough proportion that no matter what you do, I’ll have a higher payoff than you,” Plotkin said. The situation is reminiscent of a group project in junior high school. If one member of the team slacks off, the conscientious students have no choice but to work harder in order to earn a good grade.
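The idea can be sketched numerically. An extortion strategy is a list of four probabilities of cooperating, one for each outcome of the previous round, chosen so that the extortioner’s surplus over the mutual-defection payoff is a fixed multiple of the opponent’s. The formula below follows the standard presentation of Press and Dyson’s construction; the payoff values (5, 3, 1, 0), the extortion factor `chi = 3`, the scale `phi = 1/26` and all function names are assumptions made for illustration.

```python
# Illustrative payoffs (an assumption). States are (X's last move, Y's last move).
T, R, P, S = 5, 3, 1, 0
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
PAY = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}  # (own, other)


def extortion_strategy(chi, phi):
    """Probability that X cooperates after each outcome (own move, other's move).
    phi must be small enough that every entry stays between 0 and 1."""
    return {
        ("C", "C"): 1 - phi * (chi - 1) * (R - P),
        ("C", "D"): 1 - phi * ((P - S) + chi * (T - P)),
        ("D", "C"): phi * ((T - P) + chi * (P - S)),
        ("D", "D"): 0.0,
    }


def long_run_payoffs(p, q, steps=5000):
    """Long-run average payoffs (X, Y) for two memory-one strategies.
    p and q map (own last move, other's last move) to a cooperation probability.
    Iterates the four-state Markov chain from a uniform start and assumes it
    has settled down after `steps` rounds."""
    dist = {s: 0.25 for s in STATES}
    for _ in range(steps):
        new = {s: 0.0 for s in STATES}
        for (x, y), mass in dist.items():
            px, qy = p[(x, y)], q[(y, x)]
            for nx, ny in STATES:
                prob_x = px if nx == "C" else 1 - px
                prob_y = qy if ny == "C" else 1 - qy
                new[(nx, ny)] += mass * prob_x * prob_y
        dist = new
    s_x = sum(dist[(x, y)] * PAY[(x, y)] for x, y in STATES)
    s_y = sum(dist[(x, y)] * PAY[(y, x)] for x, y in STATES)
    return s_x, s_y


p = extortion_strategy(chi=3, phi=1 / 26)
OPPONENTS = {
    "always cooperate": {s: 1.0 for s in STATES},
    "always defect": {s: 0.0 for s in STATES},
    "mixed": {("C", "C"): 0.9, ("C", "D"): 0.5, ("D", "C"): 0.2, ("D", "D"): 0.1},
}
for name, q in OPPONENTS.items():
    s_x, s_y = long_run_payoffs(p, q)
    print(f"{name:>16}: extortioner surplus {s_x - P:.3f}, opponent surplus {s_y - P:.3f}")
```

Against each of these opponents the extortioner’s surplus comes out at three times the opponent’s. The opponent earns the most by cooperating fully, which also hands the extortioner its largest payoff.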
Press and Dyson’s original paper was set in a classical game theory context—a series of interactions between a single pair of players. But Plotkin and Stewart wanted to know what would happen if they applied the same mathematical approach to an evolving group, such as vervet monkeys or vampire bats, who breed and survive based on their individual fitness. They explored the broader class of successful strategies, called zero-determinant strategies, that Press and Dyson had identified.
This class of strategies includes the moral opposite of extortion: generosity. In general, a player employing a generous strategy will always cooperate when his or her opponent does. If the opponent defects, the first player will still cooperate with a certain probability in an attempt to coax the opponent back to generosity.
To Plotkin and Stewart’s relief, generous strategies rather than the extortive ones were most successful when applied to evolving populations. “We found a much rosier picture,” said Plotkin, who published the results in 2013 in the Proceedings of the National Academy of Sciences. “The most robust strategies, the ones that can’t be replaced by other strategies, are generous.”
The basic intuition is simple. “Extortion does well with one opponent,” Plotkin said. “But in a large population, an extortioner will eventually pair up with another extortioner.” Then both will defect, getting a poorer payoff. “Plotkin improved our model by turning it upside down,” Dyson said. “If you want someone to cooperate with you, it’s better to bribe the person with short-range benefits than punishing him right away.”
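That intuition can be checked with a small calculation. The sketch below builds one extortionate and one generous zero-determinant strategy from the same general formula and computes long-run payoffs for each pairing. The payoff values (5, 3, 1, 0), the parameters `chi` and `phi`, and the function names are illustrative assumptions, not the settings used in Plotkin and Stewart’s study.

```python
# Illustrative payoffs (an assumption). States are (X's last move, Y's last move).
T, R, P, S = 5, 3, 1, 0
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
PAY = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}  # (own, other)


def zd_strategy(chi, phi, baseline):
    """Memory-one strategy built to enforce
    (own payoff - baseline) = chi * (other's payoff - baseline).
    baseline = P gives extortion, baseline = R gives generosity."""
    strategy = {}
    for own, other in STATES:
        shift = phi * ((PAY[(own, other)] - baseline)
                       - chi * (PAY[(other, own)] - baseline))
        strategy[(own, other)] = shift + (1.0 if own == "C" else 0.0)
    return strategy


def long_run_payoffs(p, q, steps=5000):
    """Long-run average payoffs (X, Y); iterates the four-state Markov chain
    from a uniform start and assumes it has settled after `steps` rounds."""
    dist = {s: 0.25 for s in STATES}
    for _ in range(steps):
        new = {s: 0.0 for s in STATES}
        for (x, y), mass in dist.items():
            px, qy = p[(x, y)], q[(y, x)]
            for nx, ny in STATES:
                prob_x = px if nx == "C" else 1 - px
                prob_y = qy if ny == "C" else 1 - qy
                new[(nx, ny)] += mass * prob_x * prob_y
        dist = new
    s_x = sum(dist[(x, y)] * PAY[(x, y)] for x, y in STATES)
    s_y = sum(dist[(x, y)] * PAY[(y, x)] for x, y in STATES)
    return s_x, s_y


extortioner = zd_strategy(chi=3, phi=1 / 26, baseline=P)
generous = zd_strategy(chi=2, phi=0.1, baseline=R)

for label, a, b in [("extortioner vs extortioner", extortioner, extortioner),
                    ("generous vs generous", generous, generous),
                    ("extortioner vs generous", extortioner, generous)]:
    s_a, s_b = long_run_payoffs(a, b)
    print(f"{label}: {s_a:.2f}, {s_b:.2f}")
```

Under these assumptions the extortioner beats the generous player head to head, but two extortioners sink to the mutual-defection payoff while two generous players settle at the mutual-cooperation payoff. In a population where everyone meets everyone, those within-type scores matter.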
Hilbe confirmed these findings in a real-world scenario, pitting human players against computers using either generous or extortionist strategies. As predicted, people won larger payouts when playing against generous computers than against selfish ones. But people also tended to punish extortionist opponents, refusing to cooperate even though it would be in their best interest to do so. That in turn reduced the payoff for both human player and computer. In the end, the generous computer won a larger payout than the extortionist computer.
The extortionist’s revenge
Given these outcomes, Plotkin hoped extortionists could be kept at bay. But that optimism was short-lived. Following his 2013 study, Plotkin allowed the payoffs won by cooperating or defecting to change. Players passed both their strategy and their payoffs to their offspring, and both quantities could undergo random mutations.
With this shake-up to the system, which might correspond to a change in environmental conditions, the outcome returned to the dark side. Generosity was no longer the favored solution. “As mutations that increase the temptation to defect sweep through the group, the population reaches a tipping point,” Plotkin said. “The temptation to defect is overwhelming, and defection rules the day.”
Plotkin said the outcome was unexpected. “It’s surprising because it’s within the same framework, game theory, that people have used to explain cooperation,” he said. “I thought that even if you allowed the game to evolve, cooperation would still prevail.”
The takeaway is that small tweaks to the conditions can have a major effect on whether cooperation or extortion triumphs. “It’s quite neat to see that this leads to qualitatively different outcomes,” said Jeff Gore, a biophysicist at the Massachusetts Institute of Technology who wasn’t involved in the study. “Depending on the constraints, you can evolve qualitatively different kinds of games.”
Chris Adami, a computational biologist at Michigan State University, contends that there is no such thing as an optimal strategy—the winner depends on the conditions.
Indeed, Plotkin’s study is unlikely to be the end of the story. “I’m sure there will be people who look at how the result depends on the assumptions,” Hilbe said. “Perhaps cooperation can somehow be rescued.”
The prisoner’s future
The prisoner’s dilemma is obviously a highly simplified version of real interactions.
So how good a model is it for studying the evolution of cooperation? Dyson isn’t optimistic. He likes Plotkin’s and Hilbe’s studies, but mostly because they involve interesting mathematics. “Certainly as a description of possible worlds it’s quite interesting, but it doesn’t look to me like the world of biology,” Dyson said.
Ethan Akin, a mathematician who has explored strategies similar to Press and Dyson’s, said he thinks the results are more applicable to sociological decision making than to the evolution of cooperation.
But some experimental biologists disagree, saying that both the prisoner’s dilemma and game theory more broadly have had a profound effect on their field. “I think that the contribution of game theory to microbial cooperation is huge,” said Will Ratcliff, an evolutionary biologist at the Georgia Institute of Technology.
For example, scientists studying antibiotic resistance are using a game theory scenario called the snowdrift game, in which a player always benefits from cooperating. (If you’re stuck in your apartment building after a blizzard, you benefit by shoveling the driveway, but so does everyone else who lives there and doesn’t shovel.) Some bacteria can produce and secrete an enzyme capable of deactivating antibiotic drugs. The enzyme is costly to produce, and lazy bacteria that don’t make it can benefit by using enzymes produced by their more industrious neighbors. In a strict prisoner’s dilemma scenario, the slackers would eventually kill off the producers, harming the entire population. But in the snowdrift game, the producers have greater access to the enzyme, thus improving their fitness, and the two types of bacteria can coexist.
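The difference between the two games shows up in the payoff tables. The sketch below uses a common textbook form of the snowdrift game, in which everyone receives a benefit `b` if the work gets done and those who do it share a cost `c`. The specific numbers and function names are assumptions for illustration and do not come from the bacterial experiments.

```python
# Payoff to "me" for each (my move, partner's move); illustrative values only.
PRISONERS_DILEMMA = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

b, c = 4, 2  # assumed benefit of a cleared driveway and total cost of shoveling
SNOWDRIFT = {("C", "C"): b - c / 2,  # both shovel and split the cost
             ("C", "D"): b - c,      # I shovel alone but still get out
             ("D", "C"): b,          # my neighbor shovels for me
             ("D", "D"): 0}          # nobody shovels, nobody gets out


def best_replies(payoff):
    """For each partner move, the move that pays me the most."""
    return {partner: max("CD", key=lambda me: payoff[(me, partner)])
            for partner in "CD"}


def stable_cooperator_fraction(payoff):
    """Fraction of cooperators at which both moves pay equally in a well-mixed
    population, or None when no such mixed balance exists."""
    gain_vs_defector = payoff[("C", "D")] - payoff[("D", "D")]
    gain_vs_cooperator = payoff[("D", "C")] - payoff[("C", "C")]
    if gain_vs_defector <= 0 or gain_vs_cooperator <= 0:
        return None
    return gain_vs_defector / (gain_vs_defector + gain_vs_cooperator)


for name, game in [("prisoner's dilemma", PRISONERS_DILEMMA), ("snowdrift", SNOWDRIFT)]:
    print(name, best_replies(game), stable_cooperator_fraction(game))
```

In the prisoner’s dilemma table, defecting is the best reply to everything, so there is no mixed balance. In the snowdrift table the best reply to a slacker is to do the work yourself, which is why producers and slackers can persist side by side.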
Microbes in the lab can mimic game theory scenarios, but whether these controlled environments accurately reflect what’s happening in nature is another story. “We set the dynamics of the game by assuming a certain kind of ecology,” Ratcliff said. But those parameters might not mirror the microbe’s normal habitat. “To show that the dynamics of an experiment conform to prisoner’s dilemma or other games doesn’t necessarily mean those mechanisms drive them in nature,” Ratcliff said.
In the iterated prisoner’s dilemma, two players compete against each other in a series of rounds. Researchers can then determine which strategy is most successful in the long run. Consider one player that employs a generous strategy, attempting to entice its opponent into helping by sometimes helping even when the opponent defects, and a selfish player that tends to defect, only helping often enough to keep its opponent from defecting permanently. Each round is scored using a payoff matrix.
In a head-to-head match, the selfish strategy defeats the generous one. Yet the same strategies have different outcomes when applied to a more evolutionarily realistic setting, in which a population of players engages in a series of head-to-head encounters much like a round-robin tournament. The player that “wins” each encounter begets more offspring that employ similar strategies. Here, a single player that employs a generous strategy will tend to spread its strategy through the population.
Ultimately the entire population converts from selfish to generous strategies. Biologists use models like this to explain how cooperative behavior persists in the wild.
Reprinted with permission from Quanta Magazine, an editorially independent division of SimonsFoundation.org whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.