New AI-Based Search Engines are a “Game Changer” for Science Research

Products such as Semantic Scholar and Microsoft Academic could be a boon for scholars

A free AI-based scholarly search engine that aims to outdo Google Scholar is expanding its corpus of papers to cover some 10 million research articles in computer science and neuroscience, its creators announced on 11 November. Since its launch last year, it has been joined by several other AI-based academic search engines, most notably a relaunched effort from computing giant Microsoft.

Semantic Scholar, from the non-profit Allen Institute for Artificial Intelligence (AI2) in Seattle, Washington, unveiled its new format at the Society for Neuroscience annual meeting in San Diego. Some scientists who were given an early view of the site are impressed. “This is a game changer,” says Andrew Huberman, a neurobiologist at Stanford University, California. “It leads you through what is otherwise a pretty dense jungle of information.”

The search engine first launched in November 2015, promising to sort and rank academic papers using a more sophisticated understanding of their content and context. The popular Google Scholar has access to about 200 million documents and can scan articles that are behind paywalls, but it searches merely by keywords. By contrast, Semantic Scholar can, for example, assess which citations to a paper are most meaningful, and rank papers by how quickly citations are rising, a measure of how ‘hot’ they are.
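To make the idea of ranking by rising citations concrete, here is a minimal sketch in Python. It is not Semantic Scholar's method, which the article does not describe. The function names, the two-year comparison window and the sample citation counts are all assumptions made for illustration.

```python
def citation_trend(yearly_counts, window=2):
    """Hypothetical 'hotness' score: average citations per year in the most
    recent `window` years minus the average over the window before that."""
    recent = yearly_counts[-window:]
    earlier = yearly_counts[-2 * window:-window]
    if not recent or not earlier:
        return 0.0  # not enough history to measure a trend
    return sum(recent) / len(recent) - sum(earlier) / len(earlier)


def rank_by_trend(papers, window=2):
    """Order paper names from fastest-rising to fastest-falling citations."""
    return sorted(
        papers,
        key=lambda name: citation_trend(papers[name], window),
        reverse=True,
    )


# Made-up citation counts per year, oldest first.
sample_papers = {
    "steady": [50, 50, 50, 50],
    "rising": [2, 5, 20, 40],
    "fading": [60, 40, 10, 5],
}
ranking = rank_by_trend(sample_papers)
```

In this toy data the "steady" paper has the most citations overall, yet the "rising" paper ranks first because the score rewards growth rather than volume.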

When first launched, Semantic Scholar was restricted to 3 million papers in the field of computer science. Thanks in part to a collaboration with AI2’s sister organization, the Allen Institute for Brain Science, the site has now added millions more papers and new filters catering specifically for neurology and medicine; these filters enable searches based, for example, on which part of the brain or cell type a paper investigates, which model organisms were studied and what methodologies were used. Next year, AI2 aims to index all of PubMed and expand to all the medical sciences, says chief executive Oren Etzioni.

“The one I still use the most is Google Scholar,” says Jose Manuel Gómez-Pérez, who works on semantic searching for the software company Expert System in Madrid. “But there is a lot of potential here.”

Microsoft’s revival

Semantic Scholar is not the only AI-based search engine around, however. Computing giant Microsoft quietly released its own AI scholarly search tool, Microsoft Academic, to the public this May, replacing its predecessor, Microsoft Academic Search, which the company stopped adding to in 2012.

Microsoft’s academic search algorithms and data are available for researchers through an application programming interface (API) and the Open Academic Society, a partnership between Microsoft Research, AI2 and others. “The more people working on this the better,” says Kuansan Wang, who is in charge of Microsoft's effort. He says that Semantic Scholar is going deeper into natural-language processing—that is, understanding the meaning of full sentences in papers and queries—but that Microsoft’s tool, which is powered by the semantic search capabilities of the firm's web-search engine Bing, covers more ground, with 160 million publications.

Like Semantic Scholar, Microsoft Academic provides useful (if less extensive) filters, including by author, journal or field of study. And it compiles a leaderboard of the most influential scientists in each subdiscipline. These are the people with the most ‘important’ publications in the field, as determined by a recursive algorithm (freely available) that judges papers as important if they are cited by other important papers. The top neuroscientist for the past six months, according to Microsoft Academic, is Clifford Jack of the Mayo Clinic, in Rochester, Minnesota.
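The recursive notion of importance can be sketched with a PageRank-style iteration over a citation graph. This illustrates the general technique only and is not Microsoft's actual algorithm, which the article does not detail. The function name, the damping value of 0.85, the iteration count and the toy graph are assumptions.

```python
def rank_papers(citations, damping=0.85, iterations=50):
    """PageRank-style scores for a citation graph (illustrative sketch).

    `citations` maps each paper to the list of papers it cites. A paper's
    score grows when it is cited by papers that themselves score highly.
    """
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    score = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Every paper keeps a small baseline score.
        new_score = {p: (1.0 - damping) / n for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # Split this paper's score among the papers it cites.
                share = damping * score[p] / len(cited)
                for q in cited:
                    new_score[q] += share
            else:
                # A paper that cites nothing spreads its score evenly.
                share = damping * score[p] / n
                for q in papers:
                    new_score[q] += share
        score = new_score
    return score


# Toy graph: D and F each receive exactly one citation, but D's comes
# from C, which is itself cited by two papers.
toy_citations = {"A": ["C"], "B": ["C"], "C": ["D"], "E": ["F"]}
scores = rank_papers(toy_citations)
```

Here D ends up with a higher score than F even though both are cited once, because D's single citation comes from a paper that is itself well cited.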

Other scholars say that they are impressed by Microsoft’s effort. The search engine is getting close to combining the advantages of Google Scholar’s massive scope with the more-structured results of subscription bibliometric databases such as Scopus and the Web of Science, says Anne-Wil Harzing, who studies science metrics at Middlesex University, UK, and has analysed the new product. “The Microsoft Academic phoenix is undeniably growing wings,” she says. Microsoft Research says it is working on a personalizable version—where users can sign in so that Microsoft can bring applicable new papers to their attention or notify them of citations to their own work—by early next year.

Other companies and academic institutions are also developing AI-driven software to delve more deeply into content found online. The Max Planck Institute for Informatics, based in Saarbrücken, Germany, for example, is developing an engine called DeepLife specifically for the health and life sciences. “These are research prototypes rather than sustainable long-term efforts,” says Etzioni.

In the long term, AI2 aims to create a system that will answer science questions, propose new experimental designs or throw up useful hypotheses. “In 20 years’ time, AI will be able to read—and more importantly, understand—scientific text,” Etzioni says.

This article is reproduced with permission and was first published on November 11, 2016.
