When Hillary Clinton’s new book What Happened debuted on Amazon’s Web site last month, the response was incredible. So incredible that, of the 1,600 reviews posted on the book’s Amazon page in just a few hours, the company soon deleted 900 it suspected of being bogus: written by people who said they loved or hated the book but had neither purchased nor likely even read it. Fake product reviews, whether prompted by payola or more nefarious motives, are nothing new, but they are set to become a bigger problem as tricksters find new ways of automating online misinformation campaigns launched to sway public opinion.

Amazon has deleted nearly 1,200 reviews of What Happened since it debuted on September 12, according to ReviewMeta, a watchdog site that analyzes consumer feedback for products sold on Amazon.com. ReviewMeta gained some notoriety last year when, after evaluating seven million appraisals across Amazon, it called out the online retailer for allowing “incentivized” reviews by people paid to write five-star product endorsements. Amazon later revised its Community Guidelines to ban incentivized reviews.

Amazon’s deletions of so many appraisals for Clinton’s book caught ReviewMeta’s attention. The site gathers publicly available data on Amazon, including the number of stars a product receives, whether the writer is a verified buyer of the product and how active that person is on the site. Tommy Noonan, a programmer who founded ReviewMeta in May 2016, refrains from calling these reviews “fake,” given how politically loaded that term has become in the past year. Noonan prefers the term “unnatural.”

“There is no way to say with 100 percent certainty that a particular review is fake,” he explains. Fortunately, only a handful of items sold on Amazon’s site have had review integrity problems comparable with What Happened. And those items were “mostly Clinton books—although there was also a problem with a [Donald] Trump Christmas ornament” that received an unusually large number of negative critiques, Noonan adds.

AI to the Rescue?

It is not as easy as it might sound to churn out enough deceptive reviews to influence a product or service’s reputation on Amazon, Yelp or any other commerce site that relies heavily on consumer appraisals. Unlike fake news stories that someone writes and then tries to spread virally through social media, artificial reviews work only if they are manufactured in volume and posted to sites where a particular item is sold or advertised. They also need to be reasonably believable, although proper spelling and punctuation seem to be optional.

A group of University of Chicago researchers is investigating whether artificial intelligence could be used to automatically crank out bulk reviews that are convincing enough to be effective. Their latest experiment involved developing AI-based methods to generate phony Yelp restaurant evaluations. (Yelp is a popular crowdsourced Web site that has posted more than 135 million reviews covering about 2.8 million businesses since launching in July 2004.) The researchers used a machine-learning technique known as deep learning to analyze letter and word patterns used in millions of existing Yelp reviews. Deep learning requires an enormous amount of computation and entails feeding vast data sets into large networks of simulated artificial “neurons” based loosely on the neural structure of the human brain. The Chicago team’s artificial neural network generated its own restaurant critiques, some with sophisticated word usage patterns that made for realistic appraisals and others that would seem easy to spot, thanks to repeated words and phrases.
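
To make the idea of learning letter patterns concrete, here is a minimal sketch in Python. It is not the Chicago team's neural network: it swaps in a far simpler character-level Markov chain, which only records which character tends to follow each three-letter context and then samples from those counts. The function names and the three sample reviews are made up for illustration.

```python
import random
from collections import defaultdict

def build_model(corpus, order=3):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for text in corpus:
        for i in range(len(text) - order):
            model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, length=80, rng=None):
    """Extend `seed` one character at a time by sampling from the model.

    The seed must be exactly as long as the order used to build the model.
    """
    rng = rng or random.Random(0)
    order = len(seed)
    out = seed
    for _ in range(length):
        choices = model.get(out[-order:])
        if not choices:  # context never seen in the corpus: stop early
            break
        out += rng.choice(choices)
    return out

# Made-up sample reviews, for illustration only (not real Yelp data).
reviews = [
    "The food was great and the service was friendly.",
    "The food was cold but the staff was friendly.",
    "Great service and the pasta was amazing.",
]
model = build_model(reviews, order=3)
sample = generate(model, seed="The", length=60)
print(sample)
```

With only three sentences to learn from, the output mostly recombines fragments of the originals. A deep-learning model trained on millions of reviews captures much longer-range patterns, which is what makes its output harder to spot.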

But when the researchers tested their AI-generated reviews, they found that Yelp’s filtering software, which also relies on machine-learning algorithms, had difficulty spotting many of the fakes. Human test subjects asked to evaluate authentic and automated appraisals were unable to distinguish between the two. When asked to rate whether a particular review was “useful,” the human respondents replied in the affirmative to AI-generated versions nearly as often as real ones.

“We have validated the danger of someone using AI to create fake accounts that are good enough to fool current countermeasures,” says Ben Zhao, a Chicago professor of computer science who will present the research with his colleagues next month at the ACM Conference on Computer and Communications Security in Dallas. Like Yelp, Amazon and other Web sites use filtering software to detect suspicious reviews. This software is based on machine-learning techniques similar to those the researchers developed to write their bogus evaluations. Some filtering software tracks and analyzes data about reviewers such as their computers’ identifying internet protocol (IP) addresses or how often they post. Other defensive programs examine text for recurring words as well as phrases that may have been plagiarized from other Web sites.
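
The two kinds of signals described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Yelp's or Amazon's actual filter: the function names, the posting threshold and the sample data are all hypothetical. One function measures how many three-word phrases two texts share, a rough proxy for copied text, and the other counts how many reviews arrive from each IP address.

```python
from collections import Counter

def shingles(text, n=3):
    """Return the set of n-word sequences in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(a, b, n=3):
    """Jaccard similarity between the n-word phrase sets of two texts."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def busy_ips(posts, max_posts=2):
    """Return the IP addresses that posted more than `max_posts` reviews."""
    counts = Counter(ip for ip, _ in posts)
    return {ip for ip, count in counts.items() if count > max_posts}

# Hypothetical (ip, review) pairs; the addresses come from a range
# reserved for documentation.
posts = [
    ("203.0.113.5", "the pasta was amazing and the staff was friendly"),
    ("203.0.113.5", "the pasta was amazing and the room was clean"),
    ("203.0.113.5", "best pasta in town and amazing staff"),
    ("203.0.113.9", "we waited an hour for a cold pizza"),
]
similarity = overlap(posts[0][1], posts[1][1])
flagged = busy_ips(posts)
print(similarity, flagged)
```

A real filter would combine many such signals and learn its thresholds from labeled data rather than hard-coding them.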

The researchers are not aware of any evidence AI is currently being used to game the online review system, Zhao says—but if misinformation campaigners do turn to AI, he warns, “it basically [becomes] an arms race between attacker and defender to see who can develop more sophisticated algorithms and better artificial neural networks to either generate or detect fake reviews.” For that reason, Zhao’s team is now developing algorithms that could be used as a countermeasure to detect fake reviews—similar to the ones they created. The ability to build an effective defense requires knowing a neural network’s limitations. For example, if it is designed to focus on creating content with correct grammar and vocabulary, it is more likely to overlook the fact that it is using the same words and phrases over and over. “But [searching for such flaws] is just a short-term fix because more powerful hardware and larger data for training means that future AI models will be able to capture all these properties and be truly indistinguishable from human-authored content,” Zhao says.
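
The weakness Zhao describes, a generator that keeps reusing the same words and phrases, suggests one simple check. The sketch below is a hypothetical heuristic rather than the team's detector: it measures what share of a review's two-word phrases are repeats. The function name and the sample sentences are assumptions made for illustration.

```python
def repetition_ratio(text, n=2):
    """Share of n-word phrases in `text` that repeat an earlier phrase.

    Returns 0.0 when every phrase is distinct; values near 1.0 mean
    the text keeps cycling through the same few phrases.
    """
    words = text.lower().split()
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not grams:
        return 0.0
    return 1 - len(set(grams)) / len(grams)

# Made-up examples for illustration.
repetitive = "great food great food great food great food"
natural = "the soup was warm and the bread was fresh"
print(repetition_ratio(repetitive), repetition_ratio(natural))
```

As Zhao notes, a check like this is a short-term fix at best, because a better-trained model would simply stop repeating itself.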

As AI matures it is not a stretch to suppose it will be used to corrupt online review systems that so many people turn to before opening their wallets. But for now a more common and human-based approach to generating large numbers of fake critiques—such as those for Clinton’s book—is called “crowdturfing.” In general, crowdturfing marketplaces offer payment to people willing to help attack review systems, social media and search engines. These efforts work like an “evil Mechanical Turk,” Zhao says. Amazon created a site called Mechanical Turk in 2005 to enable the crowdsourcing of work via the internet—whether for a company that pays random Web surfers to weigh in on a new logo design or a researcher who is conducting a social science experiment.

In crowdturfing online reviews, an attacker creates a project on the Mechanical Turk site and offers to pay large numbers of people to set up accounts on Amazon, Yelp, TripAdvisor or other sites and to then post reviews intended to either raise or sink a product or service’s money-making prospects. “[A company] can pay workers small amounts to write negative online reviews for a competing business, often fabricating stories of bad experiences or service,” Zhao says. Crowdturfing has become a growing problem in China, India and the U.S. but is often limited by the amount of money a person has available to get others to do the dirty work, he adds.

Automating Fake News

In anticipation of automated misinformation technology maturing to the point where it can consistently produce convincing news articles, Zhao and his colleagues are considering fake news detection as a future direction for their research. Programs already exist to automatically generate essays and scientific papers, but a careful human read usually reveals them to be nonsensical, says Filippo Menczer, a professor of informatics and computer science at the Indiana University School of Informatics and Computing.

Articles intended purely to spread falsehoods and misinformation are currently written by humans because they need to come off as authentic in order to go viral online, says Menczer, who was not involved in the Chicago research. “That is something that a machine is not capable of doing with today’s technology,” he says. “Still, skilled AI scientists putting their effort into this not-so-noble task could probably create credible articles that spread half-truths and leverage people’s fears.”