Researchers trained a neural network to scrutinize high school essays and sniff out ghostwritten papers. Christopher Intagliata reports.
The English-language version of Wikipedia has almost six million articles. And if you're a cheating student, that's six million essays already written for you, footnotes and all. Except plagiarism isn't really an effective tactic—just plug the text into a search engine and game over.
But what about having a ghostwriter at a paper mill compose your final essay?
"Standard plagiarism software cannot detect this kind of cheating."
That's Stephan Lorenzen, a data analyst at the University of Copenhagen. In Denmark, where he's based, ghostwriting is a growing problem at high schools. So Lorenzen and his colleagues created a program called Ghostwriter that can detect the cheats.
At its core is a neural network trained and tested on 130,000 real essays from 10,000 Danish students. After reading through tens of thousands of pairs of essays, each pair labeled as being written by the same author or not, the machine taught itself to tune in to the characteristics that might tip off cheating. For example, did a student's essays share the same styles of punctuation? The same spelling mistakes? Were the abbreviations the same?
By scrutinizing inconsistencies like those, Ghostwriter was able to flag a ghostwritten essay nearly 90 percent of the time. The team presented the results at the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. [Magnus Stavngaard et al., Detecting Ghostwriters in High Schools]
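To make the idea of comparing writing habits concrete, here is a toy sketch in Python. It is not the Ghostwriter network, which learns its own features from labeled essay pairs. It is a hand-written comparison of two of the cues mentioned above, punctuation and abbreviations. The function names, the punctuation list, the abbreviation pattern, and the equal weighting of the two scores are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Assumed set of punctuation marks to profile (illustrative, not from the study).
PUNCTUATION = ",.;:!?-()\""

# Assumed pattern: dotted initialisms such as "e.g." plus a few common short forms.
ABBREVIATION_RE = re.compile(r"\b(?:[A-Za-z]\.){2,}|\b(?:etc|approx|vs)\.", re.IGNORECASE)


def punctuation_profile(text: str) -> list[float]:
    """Return the rate of each punctuation mark per 100 characters."""
    counts = Counter(ch for ch in text if ch in PUNCTUATION)
    length = max(len(text), 1)
    return [100.0 * counts[ch] / length for ch in PUNCTUATION]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def abbreviations(text: str) -> set[str]:
    """Return the lowercased abbreviations found in the text."""
    return {match.lower() for match in ABBREVIATION_RE.findall(text)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two sets; treated as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def style_similarity(essay_a: str, essay_b: str) -> float:
    """Average the punctuation and abbreviation scores (equal weights are an assumption)."""
    punct = cosine(punctuation_profile(essay_a), punctuation_profile(essay_b))
    abbr = jaccard(abbreviations(essay_a), abbreviations(essay_b))
    return (punct + abbr) / 2


if __name__ == "__main__":
    earlier = "Well, e.g., this; that, etc. Yes, really."
    similar = "So, e.g., those; these, etc. No, never."
    different = "What!!! No way!!! i.e. stop!!!"
    print(style_similarity(earlier, similar))
    print(style_similarity(earlier, different))
```

A low score between a new essay and a student's earlier work would only be a hint that the style has shifted. A trained model, like the one described here, weighs many more signals and learns from labeled examples which ones matter.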
There's one more aspect here that could help students. Your high school essays presumably get better over time as you learn to write—and the machine can detect that. "The final idea is to detect students who are at risk because their development in writing style isn't as you'd expect."
Teachers could thus give extra help to kids who really need it, while sniffing out the cheaters too.