Searching the Internet can sometimes feel like looking for a shrinking needle in an expanding haystack. A new approach to sorting information, which scans documents for abrupt "bursts" in the usage of particular words, may help. Jon Kleinberg of Cornell University described the technique yesterday at the annual meeting of the American Association for the Advancement of Science in Denver, Colo.

A besieged e-mail inbox prompted Kleinberg to design the new system. While trying to filter his mail, he theorized that whenever an important topic arose, keywords related to it would show up in messages with increasing frequency. As a result, searching for words whose usage increased dramatically and quickly (that is, words that "burst") could help identify significant topics and provide a way to categorize messages. Kleinberg devised a search algorithm that analyzes both the number of times words appear and the rate of increase in their frequency over time.
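
To make the idea concrete, here is a minimal Python sketch of burst flagging. It is not Kleinberg's algorithm, whose details are not given here; it is a simplified stand-in that looks at the same two signals, how often a word appears and how sharply its share of the text rises from one time period to the next. The function name `find_bursts`, the thresholds `min_count` and `min_ratio`, and the sample text are all hypothetical choices made for illustration.

```python
from collections import Counter


def find_bursts(slices, min_count=3, min_ratio=2.0):
    """Flag words whose relative frequency jumps between consecutive periods.

    slices: list of strings, one per time period, in chronological order.
    Returns a list of (period_index, word, count, ratio) tuples.
    This is an illustrative heuristic, not the published method.
    """
    bursts = []
    prev_freq = {}
    for i, text in enumerate(slices):
        words = text.lower().split()
        counts = Counter(words)
        total = len(words) or 1  # avoid dividing by zero on empty periods
        freq = {w: c / total for w, c in counts.items()}
        if i > 0:
            for w, c in counts.items():
                # Smooth the earlier frequency so a previously unseen word
                # does not cause a division by zero.
                baseline = prev_freq.get(w, 0.0) + 1.0 / total
                ratio = freq[w] / baseline
                # Require both enough occurrences and a sharp rise.
                if c >= min_count and ratio >= min_ratio:
                    bursts.append((i, w, c, round(ratio, 2)))
        prev_freq = freq
    return bursts


# Hypothetical sample data: three consecutive periods of text.
periods = [
    "the budget and the treaty",
    "the atomic age and atomic energy and atomic power",
    "the budget",
]
for period, word, count, ratio in find_bursts(periods):
    print(period, word, count, ratio)
```

Requiring a minimum count alongside the ratio keeps rare words from being flagged merely because they appeared once after never appearing before. A real system would also need to handle punctuation, common words, and periods of very different lengths.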

To test his approach, Kleinberg used the algorithm to scrutinize the full text of all the State of the Union addresses given since 1790. The "bursts" that the program identified matched important events occurring at the time certain speeches were delivered. For instance, in the aftermath of the American Revolution, "militia" and "British" were among the flagged words, whereas "atomic" showed a substantial "burst" between 1947 and 1959. Such trends are intuitive to people, Kleinberg notes, but the computer, which lacks historical context, identified them solely by scanning raw text. He posits that the new approach could help narrow web searches by better recognizing the time context of a query. In addition, sociologists or marketers might be able to identify emerging trends more adeptly by monitoring the "burstiness" of words in weblogs or in e-mails sent to consumer web sites.