OpenAI Develops AI-Powered Content Moderation System Using GPT-4
OpenAI, the creator of ChatGPT, is addressing the longstanding challenge of content moderation on the internet. Deciding what content is acceptable on a platform has always been a complex and subjective task. However, OpenAI believes that its latest model, GPT-4, can offer a solution. By leveraging the model's capabilities, OpenAI aims to build a content moderation system that is more adaptable, scalable and consistent.
The company wrote in a blog post that GPT-4 can not only help make content moderation decisions but also assist in policy development and rapid iteration of policy changes, “shortening the cycle from months to hours.” It says the model can parse the various rules and nuances of a content policy and instantly adapt to any updates, which OpenAI claims results in more consistent content labeling.
“We believe this offers a more positive vision of the future of digital platforms, where AI can help manage web traffic according to platform-specific policies and ease the mental burden of many human moderators,” OpenAI’s Lilian Weng, Vik Goel and Andrea Vallone wrote. “Anyone with access to the OpenAI API can implement this approach to create their own AI-assisted moderation system.” OpenAI claims that GPT-4 monitoring tools can help companies complete about six months of work in about a day.
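OpenAI's post does not include code, but the workflow it describes, feeding the policy to GPT-4 as instructions and asking it to label each piece of content, maps naturally onto the company's public chat completions API. The sketch below is one possible minimal implementation under that reading; the policy text, label names, model choice and the `moderate` helper are illustrative assumptions, not part of OpenAI's announcement.

```python
# A minimal sketch of an AI-assisted moderation check built on the OpenAI API.
# The policy text, label set, and model name are illustrative assumptions,
# not details taken from OpenAI's blog post.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical platform policy; in the workflow OpenAI describes, policy experts
# would iterate on this text and re-run the classifier to see how labels change.
POLICY = """\
Label the user content with exactly one of these categories:
- ALLOW: content that does not violate the policy
- HARASSMENT: insults, threats, or demeaning language aimed at a person
- SELF_HARM: content encouraging or depicting self-harm
Respond with the category name only.
"""

def moderate(content: str, model: str = "gpt-4") -> str:
    """Ask the model to classify a piece of content against the policy."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # keep labels as deterministic as possible for consistency
        messages=[
            {"role": "system", "content": POLICY},
            {"role": "user", "content": content},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(moderate("You're an idiot and everyone hates you."))  # expected: HARASSMENT
```

In the rapid-iteration loop the company describes, a policy expert would compare the model's labels with their own on a small test set, tighten the policy wording wherever they disagree, and simply re-run the classifier, with each round taking hours rather than the months a retraining-based pipeline would need.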
It is well documented that manually reviewing traumatic content can have a significant impact on the mental health of human moderators, especially when it comes to graphic material. In 2020, Meta agreed to pay more than 11,000 moderators at least $1,000 each in compensation for mental health issues that may have resulted from reviewing material posted to Facebook.
Using AI to take some of that burden off human reviewers could therefore be highly beneficial; Meta, for example, has been using artificial intelligence to assist its moderators for several years. Still, OpenAI says that until now human moderators have been assisted by “smaller vertical-specific machine learning models. The process is inherently slow and can cause mental stress for human moderators.”
AI models are far from perfect. Big companies have long used AI in their moderation processes, and with or without the technology, major content decisions still go wrong. It remains to be seen whether OpenAI’s system can avoid the many moderation traps we’ve seen other companies fall into over the years.
Either way, OpenAI agrees that humans still need to be involved in the process. “We have continued human evaluation to verify some of the model estimates,” Vallone, who works on OpenAI’s policy team, told Bloomberg.
“Language model estimates are subject to unwanted biases that may have been introduced into the model during training. As with any AI application, results and outputs must be carefully observed, validated and refined by involving humans,” OpenAI’s blog post reads. “By reducing human involvement in some parts of the moderation process that can be controlled by language models, human resources can be focused more on handling the complex edge cases needed to refine policy.”