By Stephen Strauss of Nature magazine
The hope that swarms of gamers can help to solve difficult biological problems has been given another boost by a report in the journal PLoS One, showing that data gleaned from the online game Phylo are helping to untangle a major problem in comparative genomics.
The game was created to address the 'multiple sequence alignment (MSA) problem', which refers to the difficulty of aligning roughly similar sequences of DNA in genes common to many species. When a DNA sequence is conserved across species, that conservation suggests the sequence plays an important role in the function of the gene.
Although computer algorithms can do very rough alignments of sequences across species, they have proven inept at getting the answer just right. "It is fair to say that present alignments are not just a little bit bad, they are really pretty crude because we have to take a lot of heuristic shortcuts," says Adam Siepel, a computational biologist at Cornell University in Ithaca, New York, who was not involved with the study.
That is where human gamers can make a difference. "Understanding when something breaks a general rule is very difficult for a computer but that is what human visual intelligence is very good at," says lead author Jérôme Waldispühl, a computational biologist at McGill University in Montreal, Canada.
Waldispühl and his colleagues created Phylo with humans' visual intelligence in mind, and released it online in November 2010. The aim of the game is to improve the sequence alignment of promoter regions -- which control when a gene is transcribed -- of 521 disease-associated genes from 44 vertebrate species. Sequences are represented by strings of blocks, each with a color corresponding to one of the four different bases that make up DNA. Players try to find the best possible match between sequences for up to eight different species at a time by shifting the sequences to the left or right, one block at a time.
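The mechanic described above can be illustrated with a small Python sketch. It is not Phylo's actual code or scoring scheme: the function names, the gap character and the simple "sum-of-pairs" score (+1 for a match, -1 for a mismatch or a gap) are assumptions chosen only to show how sliding one sequence by a single block can change how well the columns line up.

```python
from itertools import combinations

GAP = "-"  # hypothetical symbol for an empty cell on the game board


def shift_right(row: str, index: int) -> str:
    """Insert one gap before position `index`, pushing later blocks right."""
    return row[:index] + GAP + row[index:]


def pad(rows):
    """Pad every row with trailing gaps so all rows have the same width."""
    width = max(len(r) for r in rows)
    return [r.ljust(width, GAP) for r in rows]


def sum_of_pairs(rows, match=1, mismatch=-1, gap=-1):
    """Score an alignment by comparing every pair of rows, column by column.

    The weights are illustrative assumptions, not the values used by Phylo.
    """
    score = 0
    for column in zip(*pad(rows)):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # two empty cells carry no information
            if a == GAP or b == GAP:
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


# Three toy "species"; the middle one is missing its first base.
before = ["ACGT", "CGT", "ACGT"]
after = [before[0], shift_right(before[1], 0), before[2]]

print(sum_of_pairs(before))  # -4: every column is misaligned
print(sum_of_pairs(after))   # 8: one shift lines up three full columns
```

In this toy case a single move turns a poor alignment into a good one. With eight sequences and hundreds of columns, the number of possible arrangements grows far too quickly to try them all, which is why alignment programs rely on shortcuts.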
Players who completed the game before they ran out of time had their alignments entered into Phylo's database. If a player's alignment was better than the one calculated by MULTIZ, a state-of-the-art alignment program hosted by the University of California, Santa Cruz, the player's score was displayed in the game's hall of fame.
The authors write that after seven months, the game had attracted 12,252 registered users and nearly 3,000 regular players. The gamers produced roughly 350,000 solutions to various MSA problems, beating the accuracy of alignments from MULTIZ in roughly 70 per cent of the sequences they manipulated. "We have shown that humans' game-playing visual talents can do some things better than a computer algorithm," says Waldispühl.
Although other games have been used to solve biological problems -- such as Foldit and Folding@home for protein folding -- these games require players to understand at least some biochemistry before they start. For Phylo, "players don't have to know anything about genetics to make a contribution," says co-author Mathieu Blanchette. "It's a pure game."
It is a game that promises to be of increasing importance to geneticists, says computer scientist Saurabh Sinha at the University of Illinois in Urbana-Champaign, who works on computational approaches to problems in molecular biology. Sinha says that, with the genomes of 10,000 vertebrate species slated to be sequenced in the next 3-5 years alone, the MSA problem is only going to get more difficult in the future. "If you have more species to align, covering a broader evolutionary span, the current algorithms don't scale up well computationally," he says.
One collateral use of Phylo that has already emerged is as an aid to demonstrating how difficult sequence alignment is becoming, says Guillaume Bourque, a geneticist at McGill who was not involved in the study. "I have been able to use it as a teaching tool to show non-experts why alignment is so hard," Bourque says.
In the near future, Phylo will be available on tablets and cell phones, the researchers say. They are also planning to allow researchers to submit DNA, RNA or protein segments from different species that they would like players to line up.