Is it possible to automate scientific discovery? I don’t mean automating experiments. I mean: Is it possible to build a machine—a robot scientist—that can discover new scientific knowledge? My colleagues and I have spent a decade trying to develop one.

We have two main motives. The first is to better understand science. As famed physicist Richard Feynman noted: “What I cannot create, I do not understand.” In that spirit, trying to build a robot scientist forces us to make concrete engineering decisions about the relation between abstract and physical objects, between observed and theoretical phenomena, and about how hypotheses are created.

Our second motivation is technological. Robot scientists could make research more productive and cost-efficient. Some scientific problems are so complex they require a vast amount of research, and there are simply not enough human scientists to do it all; automation offers our best hope for solving those problems.

Computer technology for science has been steadily improving, including “high-throughput” laboratory automation such as DNA sequencing and drug screening. Less obvious are computers that are automating the process of data analysis and that are beginning to generate original scientific hypotheses. In chemistry, for example, machine-learning programs are helping to design drugs. The goal for a robot scientist is to combine these technologies to automate the entire scientific process: forming hypotheses, devising and carrying out experiments to test those hypotheses, interpreting the results and repeating the cycle until new knowledge is found.

The ultimate question, of course, is whether we can devise a robot scientist that can actually accomplish the entire process. The capabilities of two robots designed at our laboratory, and a handful of others around the world, suggest we can.

Adam Takes on Yeast
The pioneering work of applying artificial intelligence to scientific discovery took place at Stanford University in the 1960s and 1970s. A computer program named DENDRAL was designed to analyze mass-spectrometer data, and the related Meta-DENDRAL program was one of the first machine-learning systems. The researchers were trying to create automated instruments that could look for signs of life on Mars during the 1975 NASA Viking mission. Unfortunately, that task was beyond the technology of the day. Since then, programs such as Prospector (for geology) and Bacon (for general discovery) and more recent successors have automated such tasks as proposing hypotheses and experiments to test them. Yet most lack the ability to physically conduct their own experiments, which is crucial if artificial-intelligence systems are to work even semi-independently.

Our robot, Adam, is not humanoid; it is a complex, automated lab that would fill a small office cubicle [see box on opposite page]. The equipment includes a freezer, three liquid-handling robots, three robotic arms, three incubators, a centrifuge, and more, every piece of it automated. Of course, Adam also has a powerful computational brain—a computer that does the reasoning and controls the personal computers that operate the hardware.

Adam experiments on how microbes grow. It selects microbial strains and growth media, then observes how the strains grow in those media over several days. The robot can initiate about 1,000 strain-media combinations a day all on its own. We designed Adam to investigate an important area of biology, one that lends itself to automation: functional genomics, which investigates the relations between genes and their functions.

The first full study was on the yeast Saccharomyces cerevisiae—the organism used to make bread, beer, wine and whiskey. Biologists are most interested in the strain as a “model” organism for understanding how human cells work. Yeast cells have far fewer genes than human cells do. The cells grow quickly and easily. And although the last common ancestor between humans and yeast existed perhaps a billion years ago, evolution is very conservative, so most of what is true for a yeast cell is also true for our cells.

Adam focused on understanding the unsolved problem of how yeast uses enzymes—complex proteins that catalyze particular biochemical reactions—to convert its growth medium into more yeast and waste products. Scientists still do not fully understand this process, although they have studied it for more than 150 years. They know of many enzymes yeast produces, but in some cases not which genes encode them. Adam set out to discover the “parental genes” that encode these “orphan” enzymes.

To be able to discover some novel science, Adam needs to know a lot of existing science. We programmed Adam with extensive background knowledge about yeast metabolism and the functional genomics of yeast. The claim that Adam holds background “knowledge” rather than information is up for philosophical debate. We argue that “knowledge” is justified because it is used by Adam to reason and guide its interactions with the physical world.

Adam uses logic statements to represent its knowledge. Logic was first devised 2,400 years ago to describe knowledge with greater precision than natural language might allow. Modern logic is the most accurate way to represent scientific knowledge and to unambiguously exchange knowledge between robots and humans. Conveniently, logic can also be used as a programming language, which enables Adam’s background to be interpreted as a computer program.

To start Adam’s investigation, we programmed it with many facts. Take a typical example: in S. cerevisiae, the gene ARO3 encodes an enzyme called 3-deoxy-D-arabino-heptulosonate-7-phosphate synthase. We also gave Adam related facts, such as that this enzyme catalyzes a chemical reaction in which the compounds phosphoenolpyruvate and D-erythrose 4-phosphate react to produce 2-dehydro-3-deoxy-D-arabino-heptonate 7-phosphate, plus phosphate.
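Facts like these can be written down directly in a logic-programming language and queried. As a rough illustration in Python (the relation names, the query function and the “DAHP synthase” shorthand are hypothetical, not Adam’s actual representation), a fact base and a query over it might look like:

```python
# Rough illustration of a logical fact base and a query over it.
# Relation names and the "DAHP synthase" shorthand are hypothetical;
# Adam's actual representation is richer than this.

FACTS = [
    # relation: a gene and the enzyme it encodes
    ("encodes", "ARO3", "DAHP synthase"),
    # relation: an enzyme and the (substrates, products) of its reaction
    ("catalyzes", "DAHP synthase",
     ("phosphoenolpyruvate", "D-erythrose 4-phosphate"),
     ("2-dehydro-3-deoxy-D-arabino-heptonate 7-phosphate", "phosphate")),
]

def enzymes_encoded_by(gene):
    """Answer the query encodes(gene, E), much as a logic system would."""
    return [fact[2] for fact in FACTS
            if fact[0] == "encodes" and fact[1] == gene]

print(enzymes_encoded_by("ARO3"))  # ['DAHP synthase']
```

Because logic doubles as a programming language, the same statements that record what is known can be executed to answer questions about it.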

Connected together, the facts form a model of yeast metabolism that integrates knowledge about genes, enzymes and metabolites (small chemical molecules). The difference between a model and an encyclopedia is that a model can be converted into software that can act on data to make predictions. A robot scientist can integrate abstract scientific models with laboratory robotics to automatically test and improve the models.
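To see the difference in miniature, consider a toy model in which the genes, metabolites and reactions are all invented for illustration. Treating each reaction as a rule, software can compute which metabolites remain reachable after a gene is deleted, and hence predict whether a mutant strain should still grow:

```python
# Toy metabolic model (genes, metabolites and reactions are invented).
# A strain is predicted to grow only if every essential metabolite is
# still reachable from the nutrients after a gene deletion.

REACTIONS = [
    # (gene, substrates, products)
    ("G1", {"glucose"}, {"A"}),
    ("G2", {"A"}, {"B"}),
    ("G3", {"glucose"}, {"C"}),
]

def reachable(nutrients, deleted_gene):
    """All metabolites producible once one gene has been knocked out."""
    pool = set(nutrients)
    changed = True
    while changed:
        changed = False
        for gene, substrates, products in REACTIONS:
            if gene != deleted_gene and substrates <= pool and not products <= pool:
                pool |= products
                changed = True
    return pool

def predict_growth(deleted_gene, essential=frozenset({"B"})):
    return essential <= reachable({"glucose"}, deleted_gene)

print(predict_growth("G3"))  # True: B is still made via G1 and G2
print(predict_growth("G2"))  # False: no route to B remains
```

An encyclopedia could only list the reactions; a model like this, scaled up to thousands of reactions, yields testable predictions for every deletion strain.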

Reasoning about Genes
When scientists follow the scientific method, they form hypotheses and then experimentally test the deductive consequences of those hypotheses. In this manner, Adam first hypothesizes new facts about yeast biology, then deduces the experimental consequences of the facts using its model of metabolism. Next Adam experimentally tests the consequences to see if the hypothesized facts are consistent with the observations.

The cycle begins with Adam forming hypotheses about which genes could be the parents of orphan enzymes [see box on page 76]. To focus on the most likely hypotheses, Adam used its knowledge base. As an example, one orphan enzyme it knew about was 2-aminoadipate transaminase. This enzyme catalyzes the reaction: 2-oxoadipate plus L-glutamate yields L-2-aminoadipate plus 2-oxoglutarate (the reaction also occurs in the reverse direction). This reaction is important because it is a potential target for antifungal drugs, but the parental gene is unknown. To form a hypothesis about which yeast gene could encode this enzyme, Adam first interrogated its knowledge base to see if any genes from other organisms are known to encode the enzyme. This query returned the fact that in Rattus norvegicus (the brown rat) a gene called Aadat encodes the enzyme.

Adam took the protein sequence of the enzyme encoded by the Aadat gene and examined whether any similar protein sequences are encoded in the yeast genome. Adam knows that if protein sequences are similar enough, it is reasonable to infer that the sequences are homologous—that they share a common ancestor. Adam also knows that if protein sequences are homologous, then the function of their common ancestor may have been conserved. Therefore, from similar protein sequences Adam can reason that their encoding genes may have the same function. Adam found three yeast genes with sequences similar to Aadat: YER152c, YJL060w and YJL202w. It hypothesized that these genes each encode the enzyme 2-aminoadipate transaminase.
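This chain of reasoning can be caricatured in a few lines of code. Real systems rely on statistical sequence-search tools such as BLAST; the crude position-by-position identity score and the short protein sequences below are made up purely for illustration.

```python
# Caricature of homology-based hypothesis generation. Real systems use
# statistical sequence-search tools such as BLAST; the identity score and
# the short protein sequences below are invented for illustration.

def identity(a, b):
    """Fraction of matching positions over the shorter sequence."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n

YEAST_PROTEINS = {            # gene -> (invented) protein sequence
    "YER152c": "MKTAYLLV",
    "YJL060w": "MKTAYILV",
    "OTHER":   "GGGGCCCC",
}

def hypothesize_parents(query, threshold=0.7):
    """Genes whose protein looks homologous to the query sequence."""
    return sorted(gene for gene, seq in YEAST_PROTEINS.items()
                  if identity(query, seq) >= threshold)

aadat_protein = "MKTAYLLV"    # stand-in for the rat Aadat protein
print(hypothesize_parents(aadat_protein))  # ['YER152c', 'YJL060w']
```

Each gene that clears the similarity threshold becomes a candidate parent of the orphan enzyme, to be settled by experiment.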

To test its hypotheses, Adam conducted numerous physical experiments. It grew certain yeast strains selected from a complete collection in its freezer, where each strain has a specific gene removed. The robot examined the growth of three yeast strains that were missing the genes YER152c, YJL060w and YJL202w, respectively, when grown in the presence of chemicals such as L-2-aminoadipate that are involved in the reaction catalyzed by the enzyme.

The next step was to decide which experiments to run on the strains. Money for science is always limited. And often scientists race to be the first to solve a problem. We therefore designed Adam to devise efficient experiments that test hypotheses cheaply and quickly. To achieve this goal, Adam assumes that every hypothesis has a probability of being true. This assumption is controversial; some philosophers, such as Karl Popper, have denied that hypotheses can have associated probabilities. Most working scientists, however, tacitly assume that certain types of hypotheses are more likely to prove true than others. For example, they generally follow the notion of “Occam’s razor”—that all else being equal, a simpler hypothesis is more probable than a complex one. Adam also considers the cost of a possible experiment, which currently is just the cost of the chemicals involved. A better approach would include the “cost” of time as well.

Given a set of hypotheses with associated probabilities and a set of possible experiments with associated costs, the goal we set for Adam is to choose a series of experiments that minimizes the expected cost of eliminating all but one hypothesis. Pursuing this approach optimally is computationally very difficult, but our analyses have shown that Adam’s approximate strategy selects experiments that solve problems more cheaply and quickly than other strategies, such as simply choosing the cheapest experiment. In some cases, Adam can design one experiment that can shed light on many hypotheses. Human scientists struggle to do the same; they tend to consider one hypothesis at a time.
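A greedy approximation to this kind of experiment selection can be sketched as follows; the hypotheses, outcomes, probabilities and costs below are invented, and Adam’s actual algorithm is more sophisticated. The idea is to score each candidate experiment by how many hypotheses it is expected to eliminate per unit cost, and run the best one first.

```python
# Greedy sketch of cost-sensitive experiment selection. The hypotheses,
# predicted outcomes, probabilities and costs are invented; Adam's real
# algorithm is more sophisticated than this.

HYPOTHESES = {"H1": 0.5, "H2": 0.3, "H3": 0.2}  # prior probabilities

# experiment -> (cost, outcome predicted under each hypothesis)
EXPERIMENTS = {
    "E1": (1.0, {"H1": "grows", "H2": "grows", "H3": "no growth"}),
    "E2": (3.0, {"H1": "grows", "H2": "no growth", "H3": "no growth"}),
}

def expected_eliminations(predictions, priors):
    """Expected number of hypotheses ruled out by running the experiment."""
    total = 0.0
    for outcome in set(predictions.values()):
        p_outcome = sum(p for h, p in priors.items()
                        if predictions[h] == outcome)
        ruled_out = sum(1 for h in priors if predictions[h] != outcome)
        total += p_outcome * ruled_out
    return total

def best_experiment():
    """Pick the experiment with the most expected eliminations per unit cost."""
    def score(name):
        cost, predictions = EXPERIMENTS[name]
        return expected_eliminations(predictions, HYPOTHESES) / cost
    return max(EXPERIMENTS, key=score)

print(best_experiment())  # 'E1': far cheaper per hypothesis eliminated
```

Note how a single experiment bears on all three hypotheses at once, which is exactly the leverage the greedy score rewards.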

20 Hypotheses, 12 Novel
Once Adam’s artificial-intelligence system homes in on the most promising experiments, Adam uses its robotics to carry them out and observe the results. Adam cannot directly observe genes or enzymes; its observations consist only of how much light shines through cultures of yeast. From these data, through a complicated chain of reasoning, Adam infers whether or not the evidence is consistent with hypotheses about genes and enzymes. Such chains of reasoning are typical of science; astronomers, for example, infer what is happening in distant galaxies from the radiation they observe in their instruments.

Deciding on the consistency of hypotheses was one of the most difficult tasks for Adam, because scientists have already discovered all the genes whose removal causes qualitative differences in yeast’s growth. Removing other genes generally produces only minor growth differences. To decide whether any of the minor differences is significant when a gene is removed, Adam uses sophisticated machine-learning techniques.
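As a drastically simplified stand-in for that analysis, one could flag a deletion as significant when the mutant’s mean growth differs from the wild type’s by more than about two standard errors. The growth readings below are invented for illustration.

```python
import statistics as st

# Drastically simplified stand-in for Adam's significance analysis: flag a
# gene deletion as significant if the mutant's mean growth differs from the
# wild type's by more than two standard errors. The readings are invented.

def significant(wild, mutant, z_threshold=2.0):
    diff = abs(st.mean(wild) - st.mean(mutant))
    stderr = (st.variance(wild) / len(wild)
              + st.variance(mutant) / len(mutant)) ** 0.5
    return diff > z_threshold * stderr

wild_type = [1.00, 1.02, 0.98, 1.01, 0.99]   # optical-density-style readings
mutant    = [0.90, 0.92, 0.89, 0.91, 0.93]
print(significant(wild_type, mutant))  # True: small but consistent deficit
```

A difference of a few percent would be invisible to the eye, but enough consistent replicates make it statistically detectable, and it is from exactly such small, repeatable signals that Adam draws its conclusions.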

Adam generated and experimentally confirmed 20 hypotheses about which genes encode specific enzymes in yeast. Like all scientific claims, Adam’s needed to be confirmed. We therefore checked Adam’s conclusions using other sources of information not available to it and using new experiments we did with our own hands. We determined that seven of Adam’s conclusions were already known, one appeared wrong and 12 were novel to science.

As a check, our own manual experiments confirmed that three genes (YER152c, YJL060w and YJL202w) encode the enzyme 2-aminoadipate transaminase. The probable reason that the role of these genes had not previously been discovered is that the three genes encode the same enzyme, and the enzyme can catalyze a series of related reactions; a simple mapping of one gene to one enzyme function—the common scenario—was not the case here. Adam’s careful experimentation and statistical analysis were required to disentangle these complications.

Is the Robot a Scientist?
Some people object to the term “robot scientist,” pointing out, with some justification, that Adam resembles an assistant more than an independent scientist. So is it legitimate to claim that Adam autonomously discovered new scientific knowledge? Let’s start with “autonomously.” We cannot simply set up Adam and come back several weeks later to examine its conclusions. Adam is a prototype, and its hardware and software often break down, requiring a technician. The integration of Adam’s software modules also needs improvement so that they work together seamlessly without human interaction. Adam’s process of hypothesizing and experimentally confirming new knowledge, however, does not depend on human intellectual or physical effort.

The term “discovered” raises an argument that dates back to the 19th century and the romantic figure of Lady Ada Lovelace. She was the daughter of the poet Lord Byron and collaborated with Charles Babbage, the first person to conceive of a general-purpose computing machine. Lady Lovelace argued: “The Analytical Engine has no pretensions to originate anything. It can do whatever we know how to order it to perform” (her italics). One hundred years later the great computer scientist Alan M. Turing proposed a counterargument by way of an analogy to children. Just as teachers do not get all the credit for their pupils’ discoveries, it would be unfair for humans to claim all the credit for the ideas of our machines. These arguments are of growing commercial importance; for example, in U.S. patent law only a “person” can “invent” something.

Finally, how novel is Adam’s science? Some of the mappings between genes and enzyme functions in S. cerevisiae that Adam has hypothesized and experimentally confirmed are certainly novel. Although this knowledge is modest, it is not trivial. In the case of the enzyme 2-aminoadipate transaminase, Adam found three separate genes that may solve a 50-year-old puzzle. Of course, some of Adam’s conclusions could be wrong; all scientific knowledge is provisional. Yet it seems unlikely that all the conclusions are wrong. Adam’s results have now been in the public domain for two years, and no one has noted any mistakes. As far as I know, scientists outside of my group have not yet tried to reproduce Adam’s results.

Another way of assessing whether Adam is a scientist is whether Adam’s approach to generating novel hypotheses is generalizable. Once Adam was off running experiments, we began developing a second robot. Eve applies the same automated cycles of research to drug screening and design, an important medical and commercial pursuit. The design lessons we learned from Adam make Eve a much more elegant system. Eve’s research is focused on malaria, schistosomiasis, sleeping sickness and Chagas disease. We are still developing Eve’s software, but the robot has already found some interesting compounds that show promise of being active against malaria.

Some researchers are applying approaches that are similar to Adam’s. Hod Lipson of Cornell University is using automated experimentation to improve the design of mobile robotics and to understand dynamic systems. Other researchers are trying to develop robot scientists for chemistry, biology and engineering.

Several groups, including my own, are looking into ways to automate quantum physics research, in particular how to control quantum processes. For example, Herschel A. Rabitz of Princeton University is investigating ways to use femtosecond (10⁻¹⁵-second) lasers to learn how to make or break targeted chemical bonds. Here the challenge is how to quickly formulate intelligent experiments.

Human Partners
If we accept that robots can be scientists, we would like to know their limits. Comparing the task of automating science with automating chess is instructive. Automating chess is essentially a solved problem. Computers play chess as well as or better than the best humans and make strikingly beautiful moves. Computer mastery is possible because chess is a bounded, abstract world: 64 squares, 32 pieces. Science shares much of the abstract nature of chess, but automating science will be harder because experimentation takes place in the physical world. I expect, however, that developing robot scientists capable of performing quality science will probably be easier than developing artificial-intelligence systems that can socially interact with humans. In science it is safe to assume that the physical world is not trying to deceive you, whereas that is not true in society.

The most accomplished human chess masters now use computers to improve their game—to analyze positions and to prepare new attacks. Similarly, human and robot scientists working together, with contrasting strengths and weaknesses, could achieve more than either one could alone. Advances in computer hardware and in artificial-intelligence systems will lead to ever smarter robot scientists.

Whether these creations will ever be capable of paradigm-shifting insights or be limited to routine scientific inquiries is a key question about the future of science. Some leading scientists, such as physics Nobel laureate Philip Anderson, argue that paradigm-shifting science is so profound that it may not be accessible to automation. But another physics Nobel laureate, Frank Wilczek, has written that in 100 years the best physicist will be a machine. Time will tell who is correct.

Either way, I see a future where networks of human and robot scientists will collaborate. Scientific knowledge will be described using logic and disseminated instantaneously using the Web. The robots will gradually assume an ever greater role in the advancement of science.