Programs meant to detect the use of AI in content have been found to discriminate against non-native English speakers. (Unsplash)

Study reveals AI detection programs are biased against non-native English speakers

A recent study has shed light on discrimination not only being limited to humans in society, but also extending to generative AI. The rise in popularity of generative AI, particularly with the introduction of ChatGPT, has prompted the development of AI detection programs to prevent misuse, such as cheating in exams. These programs analyze content to determine if it was authored by a human or an AI. However, these programs are now facing allegations of exhibiting alarming discrimination against individuals who are non-native English speakers.

Yes, Generative AI has been accused of bias in the past, and now a new study has revealed that its detection programs are also discriminatory.

Discrimination with an artificial intelligence recognition program

Computer programs used to detect AI involvement in papers, exams and job applications can discriminate against non-native English speakers, according to research led by James Zou, an assistant professor of biomedical data science at Stanford University. The study, published by Cell Press, ran 91 English essays written by non-native English speakers through 7 different programs used to detect GPT, and the findings are striking.

As many as 61.3 percent of the essays, originally written for the TOEFL exam, were flagged as created by artificial intelligence. Shockingly, one program even flagged 98 percent of the essays as AI-generated.

For comparison, the researchers also ran essays written by eighth graders who speak English as their mother tongue through the same detectors, and almost 90 percent of those came back as human creations.

How do these programs work?

To detect AI involvement, these programs examine text perplexity, which is a statistical measure of how well a generative AI model predicts text. Perplexity is considered low if the LLM can easily predict the next word in the sentence. LLMs like ChatGPT generate low-perplexity content, which tends to use simpler, more predictable words. Since non-native English speakers also tend to use simpler words, their written content is vulnerable to being mistakenly labeled as AI-generated.
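To make the idea concrete, perplexity can be computed as the exponential of the average negative log-probability a language model assigns to each token. The sketch below is purely illustrative and is not any detector's actual code; the per-token probabilities are invented for the example, and a real detector would obtain them from a language model.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability.

    Lower values mean the model found the text easy to predict,
    which perplexity-based detectors treat as a sign of AI authorship.
    """
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical probabilities a language model might assign to each word.
simple_text = [0.5, 0.4, 0.5, 0.6]     # common, predictable wording
varied_text = [0.05, 0.1, 0.02, 0.08]  # rarer, less predictable wording

# Simpler wording yields lower perplexity, so a detector using a
# perplexity threshold is more likely to flag it as AI-generated.
print(perplexity(simple_text))
print(perplexity(varied_text))
```

This illustrates the bias the study describes: text built from common, predictable words scores low on perplexity regardless of who wrote it, which is why essays by non-native speakers can be misclassified.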

The researchers said: “Therefore, practitioners should be cautious about using low perplexity as an indicator of AI-generated text, as such an approach may inadvertently exacerbate systemic biases against non-native authors in the academic community.”
