OpenAI thinks AI should replace human content moderators

17 Aug 2023


More speed and consistency are two reasons OpenAI wants GPT-4 to take over the highly stressful and overwhelming task of content moderation.

Content moderation at scale is one of tech’s most challenging problems and OpenAI believes GPT-4, its flagship generative AI model, can provide a solution.

In a blog post published this week, OpenAI argues that using GPT-4 for content moderation allows much faster iteration on policy changes, cutting the time the process takes “from months to hours”.

It claims that GPT-4 can interpret the rules and nuances in long content policy documents and “adapt instantly” to policy updates, producing more consistent labels than human content moderators.

“We believe this offers a more positive vision of the future of digital platforms, where AI can help moderate online traffic according to platform-specific policy and relieve the mental burden of a large number of human moderators,” the company wrote on its website.

“Anyone with OpenAI API access can implement this approach to create their own AI-assisted moderation system.”

Social media companies spend millions every year on human content moderators, who often have to scan through large amounts of content and flag anything that is inappropriate, harmful or otherwise against company policies.

“The process is inherently slow and can lead to mental stress on human moderators,” said OpenAI of the problem, which has been gaining attention in recent years.

From exposés on the trauma these workers can experience and clean-up commitments made by the largest platforms, to legislation proposed by policymakers and legal challenges against employers, the public at large is becoming more aware of the ‘silent heroes of the internet’.

OpenAI explained that GPT-4 can also shorten the process of developing and customising content policies from months to hours. This could prove especially beneficial to social media companies that need to comply with the EU Digital Services Act.

Once a policy guideline is written, experts can create what OpenAI calls a “golden set” of data by identifying a small number of examples and assigning them labels according to the policy. GPT-4 then reads the policy and assigns labels to the same dataset without seeing the answers.

“By examining the discrepancies between GPT-4’s judgements and those of a human, the policy experts can ask GPT-4 to come up with reasoning behind its labels, analyse the ambiguity in policy definitions, resolve confusion and provide further clarification in the policy accordingly,” OpenAI added.
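The comparison step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not OpenAI's implementation: the function names, the label strings and the sample data are all hypothetical, and `toy_label_fn` is a simple keyword rule standing in for a real call to a language model.

```python
from typing import Callable, Dict, List, Tuple

# A golden-set entry pairs a piece of content with the label an expert gave it.
GoldenExample = Tuple[str, str]


def find_discrepancies(
    golden_set: List[GoldenExample],
    policy: str,
    label_fn: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Return the examples where the model's label differs from the expert's."""
    disagreements = []
    for text, expert_label in golden_set:
        # The labelling function receives only the policy and the content,
        # never the expert's answer.
        model_label = label_fn(policy, text)
        if model_label != expert_label:
            disagreements.append(
                {"text": text, "expert": expert_label, "model": model_label}
            )
    return disagreements


def agreement_rate(golden_set: List[GoldenExample], disagreements: List[Dict[str, str]]) -> float:
    """Share of golden-set examples on which model and expert agree."""
    if not golden_set:
        raise ValueError("golden set is empty")
    return 1 - len(disagreements) / len(golden_set)


def toy_label_fn(policy: str, text: str) -> str:
    """Hypothetical stand-in for a model call: a naive keyword rule."""
    return "violation" if "attack" in text.lower() else "allowed"


# Hypothetical policy text and expert-labelled examples.
policy = "Content that incites violence against people is a violation."
golden_set = [
    ("How do I attack this chess opening?", "allowed"),
    ("Tips for planting tomatoes", "allowed"),
    ("Join us to attack the rival fans", "violation"),
]

disagreements = find_discrepancies(golden_set, policy, toy_label_fn)
rate = agreement_rate(golden_set, disagreements)

for item in disagreements:
    print(f"expert={item['expert']} model={item['model']}: {item['text']}")
print(f"agreement: {rate:.0%}")
```

In this toy run the keyword rule mislabels the chess question, which is the kind of discrepancy a policy expert would review before clarifying the policy wording and running the comparison again.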

This, the Microsoft-backed company argues, can result in more consistent labels, faster feedback loops and even reduced mental burden for human moderators who suffer from “emotional exhaustion and psychological stress” in a way that AI cannot.

But OpenAI also warned that judgements by language models are “vulnerable to undesired biases” that might have been introduced into the model during training. This means the model’s output will need to be carefully monitored and refined by humans in the loop.

“By reducing human involvement in some parts of the moderation process that can be handled by language models, human resources can be more focused on addressing the complex edge cases most needed for policy refinement,” the company said.


Vish Gain is a journalist with Silicon Republic