Word 'Bursts' Could Help Refine Web Searches


Searching the Internet can sometimes feel like looking for a shrinking needle in an expanding haystack. A new approach to sorting information, which scans documents for abrupt "bursts" in the usage of particular words, may help. Jon Kleinberg of Cornell University described the technique yesterday at the annual meeting of the American Association for the Advancement of Science in Denver, Colo.

A besieged e-mail inbox prompted Kleinberg to design the new system. While trying to filter his mail, he theorized that whenever an important topic arose, keywords related to it would show up in messages with increasing frequency. As a result, searching for words whose usage increased dramatically and quickly--or "burst"--could help identify significant topics and provide a way to categorize messages. Kleinberg devised a search algorithm that analyzes both the number of times words appear and the rate of increase in their frequency over time.
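
The general idea of combining a word's count with how fast its frequency rises can be illustrated with a short Python sketch. This is not Kleinberg's algorithm, whose details are not given here. It is a simplified stand-in that compares each time period with the one before it, and the function names, thresholds and sample messages are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def find_bursts(periods, min_count=3, min_ratio=2.0):
    """Flag words whose share of the text jumps from one period to the next.

    periods   -- list of strings, one per time period, in chronological order
    min_count -- assumed threshold: ignore words seen fewer times than this
    min_ratio -- assumed threshold: required growth in relative frequency
    Returns a list of (period_index, word, ratio) tuples.
    """
    bursts = []
    prev_counts, prev_total = Counter(), 1
    for i, text in enumerate(periods):
        words = tokenize(text)
        counts = Counter(words)
        total = len(words) or 1
        if i > 0:
            for word, count in sorted(counts.items()):
                rate = count / total
                # Treat a previously unseen word as if it had appeared once,
                # so the ratio stays finite.
                baseline = max(prev_counts[word], 1) / prev_total
                ratio = rate / baseline
                if count >= min_count and ratio >= min_ratio:
                    bursts.append((i, word, ratio))
        prev_counts, prev_total = counts, total
    return bursts


# Hypothetical inbox contents, grouped by week.
messages_by_week = [
    "lunch meeting notes lunch",
    "budget meeting budget review budget deadline",
    "budget lunch notes meeting",
]
for week, word, ratio in find_bursts(messages_by_week):
    print(week, word, round(ratio, 2))
```

Requiring both a minimum count and a minimum growth ratio mirrors the two quantities the article mentions: a word must appear often enough to matter and must rise quickly enough to count as a burst.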

To test his approach, Kleinberg used the algorithm to scrutinize the full text of all the State of the Union addresses given since 1790. The "bursts" that the program identified matched important events occurring at the time certain speeches were delivered. For instance, in the aftermath of the American Revolution, "militia" and "British" were among the flagged words, whereas "atomic" showed a substantial "burst" between 1947 and 1959. Such trends are intuitive to people, Kleinberg notes, but a computer, which lacks historical context, still identified them solely by scanning raw text. He posits that the new approach could help narrow web searches by better recognizing the time context of a query. In addition, sociologists or marketers might be able to identify emerging trends more adeptly by monitoring the "burstiness" of words in weblogs or in e-mails sent to consumer web sites.
