Latest AI could one day take over as the biggest editor of Wikipedia

13 Feb 2020


Researchers have developed an AI that can automatically rewrite outdated sentences on Wikipedia, drastically reducing the need for human editing.

Although thousands of volunteer editors dedicate many hours to keeping Wikipedia up to date, maintaining an estimated 52m articles seems like an almost impossible task. However, researchers from MIT are set to unveil a new AI that could be used to automatically correct inaccuracies on the online encyclopaedia, thereby giving human editors a robotic helping hand.

In a paper presented at the AAAI Conference on AI, the researchers described a text-generating system that pinpoints and replaces specific information in relevant Wikipedia sentences, while keeping the language similar to how humans write and edit.

The idea is that humans could type an unstructured sentence containing the updated information into an interface, without the need to worry about grammar. The AI would then search Wikipedia for the right pages and the outdated information on them, and rewrite that information in a human-like style.

The researchers are hopeful that, down the line, it could be possible to build an AI that can do the entire process automatically. This would mean it could scour the web for updated news on a topic and replace the corresponding text in the relevant article.

Taking on ‘fake news’

“There are so many updates constantly needed to Wikipedia articles. It would be beneficial to automatically modify exact portions of the articles, with little to no human intervention,” said Darsh Shah, a PhD student in MIT’s Computer Science and AI Laboratory, who is one of the lead authors.

“Instead of hundreds of people working on modifying each Wikipedia article, then you’ll only need a few, because the model is helping or doing it automatically. That offers dramatic improvements in efficiency.”

Looking beyond Wikipedia, the study also put forward the AI’s potential benefits as a tool to eliminate bias when training detectors of so-called ‘fake news’. Some of these detectors train on datasets of agree-disagree sentence pairs to verify a claim by matching it to given evidence.

“During training, models use some language of the human-written claims as ‘give-away’ phrases to mark them as false, without relying much on the corresponding evidence sentence,” Shah said. “This reduces the model’s accuracy when evaluating real-world examples, as it does not perform fact-checking.”
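To make the 'give-away' problem concrete, here is a minimal sketch in Python. It is not the researchers' model or dataset: the four sentence pairs and both function names are invented purely for illustration. It shows how a detector could score perfectly on its training pairs by spotting a word such as 'not' in the claim, without ever reading the evidence.

```python
from collections import Counter

# Hypothetical toy data, invented for illustration: (claim, evidence, label).
PAIRS = [
    ("The film was released in 2010.", "The film premiered in 2010.", "agree"),
    ("The film was not released in 2010.", "The film premiered in 2010.", "disagree"),
    ("The bridge was not opened in 1999.", "The bridge opened in 1999.", "disagree"),
    ("The bridge opened in 1999.", "The bridge opened in 1999.", "agree"),
]


def tokenize(sentence):
    """Lowercase a sentence, drop the final full stop and split on spaces."""
    return sentence.lower().rstrip(".").split()


def giveaway_words(pairs):
    """Return words that occur in 'disagree' claims but never in 'agree' claims."""
    seen = {"agree": Counter(), "disagree": Counter()}
    for claim, _evidence, label in pairs:
        seen[label].update(tokenize(claim))
    return {word for word in seen["disagree"] if word not in seen["agree"]}


def claim_only_guess(claim, giveaways):
    """Label a claim by looking at its wording alone; the evidence is never read."""
    return "disagree" if set(tokenize(claim)) & giveaways else "agree"


giveaways = giveaway_words(PAIRS)
correct = sum(claim_only_guess(c, giveaways) == label for c, _e, label in PAIRS)
print(giveaways, correct / len(PAIRS))
```

On this toy data the shortcut matches every training label, yet it would also mark a true claim such as 'The treaty was not signed in 1950.' as 'disagree', whatever the evidence says. That is the kind of bias an augmented dataset is meant to counter.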

The researchers applied their AI to the agree-disagree method of disinformation detection, using it to create an augmented dataset that reduced the error rate of a popular detector by 13pc.

Colm Gorey was a senior journalist with Silicon Republic