News Blog

Jan 26, 2009 04:52 PM in Basic Science

Can a Google algorithm identify the best scientific research?

By John Matson

 
How can one quantify the importance of a given scientific paper? One simple and widely used measure is the number of times that paper is cited in subsequent publications. But critics note that counting citations favors disciplines such as biology, where papers tend to be cited more, over fields such as mathematics, where citations are less frequent. In addition, a citation from a relatively marginal paper counts just the same as a citation from a leading researcher publishing in a marquee journal.

In a study published in October in the Journal of Neuroscience and recently made available at the online repository arxiv.org, physicists Sergei Maslov of Brookhaven National Laboratory and Sidney Redner of Boston University examine the value of Google's PageRank algorithm as it applies to ranking scientific works. (The researchers rightly point out that no quantitative system can truly "value" a scientific work—but since many such metrics are already in use, it stands to reason that they should be improved.)

Instead of hyperlinks, their version of PageRank takes journal citations as the fundamental link of a hypothetical network. By traversing this network randomly from node to node, PageRank gives a higher rank to papers that are better connected—that is, papers cited by papers that are, in turn, frequently cited. It also accounts for the varying citation habits of different disciplines—a paper that cites a small number of predecessors confers more value on its citations in the PageRank algorithm than does a paper that cites dozens of past works.
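To make the idea concrete, here is a minimal sketch of PageRank run over a toy citation network. It is an illustration only, not the researchers' code: the function name, the five-paper network and the give-up probability d = 0.15 (the value conventionally quoted for Web PageRank) are all assumptions.

```python
def citation_pagerank(citations, d=0.15, iterations=100):
    """Rank papers by simulating a random reader who follows citations.

    citations maps each paper to the list of papers it cites.
    d is the assumed probability of abandoning the chain at each step
    and restarting at a random paper.
    """
    papers = sorted(set(citations) | {p for refs in citations.values() for p in refs})
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper receives an equal share of the restarting readers.
        new_rank = {p: d / n for p in papers}
        for paper in papers:
            refs = citations.get(paper, [])
            if refs:
                # A paper splits its weight evenly among the works it cites,
                # so a short reference list passes more weight per citation.
                share = (1 - d) * rank[paper] / len(refs)
                for cited in refs:
                    new_rank[cited] += share
            else:
                # A paper with no references sends its readers anywhere.
                share = (1 - d) * rank[paper] / n
                for p in papers:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical network: B has three citations, A has only two,
# but one of A's citations comes from the heavily cited B.
toy = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["B"], "E": ["B"]}
ranks = citation_pagerank(toy)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy case paper A ends up ranked above paper B even though B collects more raw citations, because A is cited by the well-connected B.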

The physicists make one modification to the Google algorithm: they boost the "boredom factor," the rate at which a hypothetical network surfer is presumed to give up. Whereas a Web user might follow a chain of six hyperlinks before moving on, a researcher digging in the scientific literature might go back only two steps.
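One way to read those step counts, assuming the boredom factor is a fixed probability d of giving up at each step: the average chain then lasts about 1/d steps, counting the step at which the surfer quits. The values 0.15 and 0.5 below are assumptions chosen to match the "six" and "two" in the text; the article itself does not state them.

```python
def mean_chain_length(d):
    """Average number of steps before a surfer who quits with
    probability d per step gives up (mean of a geometric distribution)."""
    if not 0 < d <= 1:
        raise ValueError("d must be in the interval (0, 1]")
    return 1.0 / d


web_steps = mean_chain_length(0.15)    # assumed Web-surfer value, between 6 and 7
paper_steps = mean_chain_length(0.5)   # assumed literature value, exactly 2
print(web_steps, paper_steps)
```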

Henry Small, chief scientist at the scientific business of Thomson Reuters, a major provider of data on peer-reviewed publishing, says that the PageRank approach is one of many that have emerged recently as more and more information becomes available online.

"The field has literally exploded in the last five years or so," he says. And although standard approaches such as citation counts may be less-than-perfect measures of scientific influence, he notes, "there's a trade-off for all of these."

Applied to the physics journals of the American Physical Society, including the prestigious Physical Review Letters, PageRank turns up a veritable who's who of physics luminaries, including a suite of Nobelists. The problem, as Small points out, is that PageRank has a certain "time-lag effect." While the algorithm is "good for identifying classic papers," he says, it can take years for an important paper to develop a broad enough network of citation links to leap into the upper PageRank echelon.

Maslov and Redner correct for this effect by introducing an exponential preference for more recent papers—interestingly, this factor seems to align the results more closely with traditional citation counts. You can test their adjusted algorithm, dubbed CiteRank, on Maslov's Web page.
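The sketch below shows one plausible way such a recency preference could enter the calculation: the random reader starts, and restarts, at a paper chosen with a weight that decays exponentially with the paper's age. This is an assumption made for illustration, not the published CiteRank code; the function name, the decay time tau, the value of d and the toy data are all hypothetical.

```python
import math


def recency_weighted_rank(citations, ages, d=0.5, tau=2.0, iterations=100):
    """PageRank-style ranking whose starting points favor recent papers.

    citations maps each paper to the papers it cites; ages maps each
    paper to its age in years. Both d and tau are assumed values.
    """
    papers = sorted(ages)
    # Exponential preference: a paper of age t gets weight exp(-t / tau).
    weights = {p: math.exp(-ages[p] / tau) for p in papers}
    total = sum(weights.values())
    start = {p: w / total for p, w in weights.items()}
    rank = dict(start)
    for _ in range(iterations):
        # Readers who give up restart according to the recency weights.
        new_rank = {p: d * start[p] for p in papers}
        for paper in papers:
            refs = citations.get(paper, [])
            if refs:
                share = (1 - d) * rank[paper] / len(refs)
                for cited in refs:
                    new_rank[cited] += share
            else:
                # No references to follow, so restart by recency as well.
                for p in papers:
                    new_rank[p] += (1 - d) * rank[paper] * start[p]
        rank = new_rank
    return rank


# Two papers with one citation each; only their ages differ.
toy_citations = {"old": [], "cites_old": ["old"], "new": [], "cites_new": ["new"]}
toy_ages = {"old": 10, "cites_old": 9, "new": 2, "cites_new": 1}
recent_ranks = recency_weighted_rank(toy_citations, toy_ages)
print(recent_ranks["new"] > recent_ranks["old"])
```

With equal citation counts, the newer paper comes out ahead, which is the kind of correction for the time lag that the paragraph above describes.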

Photo ©iStockphoto.com/Felix Möckel

Read More About: CiteRank, PageRank, modeling, quantifying scientific influence, mesur project, predicting nobel prizes, network theory

© 1996-2009 Scientific American Inc. All Rights Reserved. Reproduction in whole or in part without permission is prohibited.