Wearied by trying to stay abreast of a deluge of minutely detailed information, molecular biologist Russ B. Altman of Stanford University is developing ways to distill all the essential content of some 300 research reports in his subspecialty and represent it in a computer-readable form that specialized software will be able to interrogate in complex ways. Other researchers around the world will have access to that knowledge base and will be able to use the specialized search software via the World Wide Web.
Altman is including in his knowledge base not just data from the "results" sections of the reports but also information about how experiments were done and about the facts and principles the experimenters employed in interpreting them. Altman hopes that once he gets his knowledge base off to a good start, other researchers worldwide will be willing to follow his lead: they, too, will encode their research publications (using programs that he is developing) and add them to the knowledge base.
Researchers should gain several advantages from such a system. They could, for example, set up search programs that would alert them whenever anyone inputs new research that impinges directly on their subject of interest. Other functions could instantly alert investigators to conflicts between their findings and those previously reported elsewhere--even if the contradictory studies were done decades ago.
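To make the conflict-alert idea concrete, here is a minimal sketch in Python. It is not Altman's software: the `Finding` record, the `conflicts` function, the paper labels and every number are hypothetical, and the rule that two findings conflict when their reported ranges fail to overlap is an assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One encoded result (hypothetical schema, not the real knowledge base)."""
    paper: str        # citation label, invented for this example
    year: int
    quantity: str     # what was measured
    value: float
    tolerance: float  # reported uncertainty, in the same units as value


def conflicts(new: Finding, known: list[Finding]) -> list[Finding]:
    """Return earlier findings on the same quantity whose reported
    ranges do not overlap the range of the new finding."""
    return [
        old for old in known
        if old.quantity == new.quantity
        and abs(old.value - new.value) > old.tolerance + new.tolerance
    ]


# Invented records: two older reports of one quantity, one of another.
known = [
    Finding("Paper A", 1975, "distance_P_to_Q", 40.0, 5.0),
    Finding("Paper B", 1992, "distance_P_to_Q", 62.0, 4.0),
    Finding("Paper C", 1994, "distance_R_to_S", 20.0, 3.0),
]
new = Finding("New report", 1997, "distance_P_to_Q", 58.0, 3.0)

for old in conflicts(new, known):
    print(f"{new.paper} disagrees with {old.paper} ({old.year})")
```

Under these assumptions the decades-old Paper A is flagged, while Paper B agrees within the stated uncertainties and Paper C concerns a different quantity.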
Most important, Altman maintains that his knowledge base will actually help workers generate and test hypotheses, because it will be able to evaluate quickly whether a fresh idea is consistent with everything known so far. "You can imagine writing a query that says, 'Give me all the reported values of this number that embody assumption X,'" Altman told a crowd of physician-researchers at a recent gathering at the Institute of Medicine in Washington, D.C. Altman's attempt to represent scientific knowledge draws on ideas that originated in artificial intelligence but have not seen wide use in science or everyday life. "Scientists won't be replaced," he claims, "but they will be augmented."
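The kind of query Altman describes might look something like the following Python sketch. It assumes, purely for illustration, that each reported value is stored together with the set of assumptions behind it; the `ReportedValue` record, the `values_embodying` function and the sample data are hypothetical and do not come from his system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedValue:
    """A reported number plus the assumptions it rests on (hypothetical schema)."""
    paper: str
    quantity: str
    value: float
    assumptions: frozenset[str]


def values_embodying(records, quantity, assumption):
    """'Give me all the reported values of this number that embody assumption X.'"""
    return [
        r for r in records
        if r.quantity == quantity and assumption in r.assumptions
    ]


# Invented sample records for illustration only.
records = [
    ReportedValue("Paper A", "distance_P_to_Q", 40.0, frozenset({"X", "Y"})),
    ReportedValue("Paper B", "distance_P_to_Q", 62.0, frozenset({"Y"})),
    ReportedValue("Paper C", "distance_P_to_Q", 45.0, frozenset({"X"})),
    ReportedValue("Paper D", "distance_R_to_S", 20.0, frozenset({"X"})),
]

for r in values_embodying(records, "distance_P_to_Q", "X"):
    print(r.paper, r.value)
```

Storing the assumptions alongside each value is what makes such a question answerable; a table of results alone could not distinguish Paper A from Paper B.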
Turning a written research report into something that a computer can sort and organize is not easy. Altman's first attempt to render a published paper into machine-analyzable form took two weeks, but as the concepts and relationships in the knowledge base have accumulated, the amount of work needed to represent successive papers has dropped dramatically. His goal is to produce user-friendly programs that enable a scientist to enter the contents of his or her latest paper in less than an hour. The research contained therein could then be analyzed, in conjunction with all other relevant findings, by any investigators having a computer and a modem. Altman has recently been awarded grants by the National Library of Medicine and by the National Science Foundation to develop his system.
Altman's system, called Riboweb, is just one among many imaginative research projects springing up on the Web, which is rapidly evolving into a collaborative tool with functions that extend far beyond simply publishing data. The Molecular Science Exploratorium at the San Diego Supercomputer Center (SDSC), for example, allows researchers to examine detailed graphic renderings of molecular structures; the users can then follow a hyperlink that leads straight to the research reports that produced those images. The SDSC has also set up an experimental "Computational Molecular Biology Project Area" on the Web. And Claudine Medigue of the Pasteur Institute in Paris has created an expert system that assists researchers in utilizing computer programs to analyze genetic sequences.
Altman believes computer representation of research reports will be especially useful in mature fields that have well-defined concepts and analytical techniques. In these fields, investigators pushing into new areas of research should be able to encode their results relatively easily. His experience suggests to him that areas such as combinatorial chemistry and clinical trials, as well as much of molecular biology, would be especially well suited to computer representation.
Other scientists are watching Altman's project with interest. Harry F. Noller, Jr., of the University of California at Santa Cruz, a ribosome biologist who occasionally collaborates with Altman, suggests the success of the scheme will depend crucially on how easy it becomes to encode a research report. "By the time you've finished a paper you're about ready for the oxygen tent," he notes; investigators will be unwilling to invest much additional effort to feed their findings into a persnickety computer.
Another major question concerns who will be responsible for maintaining the computer-based records of scientific endeavor. The Swiss government caused an uproar among biomedical researchers recently when it decided it no longer wanted to fund a protein database, Swiss-Prot, that had originated as a research project. Altman acknowledges he will not be able to run Riboweb forever by himself. He hopes academic publishers might step in to manage it and establish other scientific knowledge bases. The companies' profits could derive from small fees charged to their scientist customers. Although most firms are still struggling to find ways of building sustainable--read "profitable"--Web-based businesses, some large academic publishers have expressed interest in Riboweb, Altman says.
Many established journals have already taken the leap onto the Web. Perhaps science will in time slip completely into cyberspace, liberating investigators and trees from the hard-copy legacy of ever longer shelves groaning under the weight of ever more detailed and dusty tomes.