Study finds AI models have difficulty recognizing gibberish
According to a study published on Thursday, chatbots and other applications powered by AI models still struggle to differentiate between nonsensical content and natural language.
Researchers at Columbia University in the US said their work revealed the limitations of current AI models and suggested it was too early to entrust them with decisions in legal or medical settings.
They tested nine AI models, firing hundreds of pairs of sentences at them and asking which sentence in each pair was more likely to be heard in everyday speech.
They asked 100 people to make the same judgment about pairs of sentences such as: “The buyer can also own the genuine product / Walk the perimeter of the surrounding high school.”
The study, published in the journal Nature Machine Intelligence, then compared the AI responses with the human responses and found dramatic differences.
Advanced models such as GPT-2, an earlier version of the model that powers the viral chatbot ChatGPT, tended to match the human responses. Other, simpler models fared less well.
But the researchers stressed that all models made mistakes.
“Every model exhibited blind spots, labeling some sentences as meaningful that human participants thought were gibberish,” said psychology professor Christopher Baldassano, an author of the paper.
“That should give us pause about the extent to which we want AI systems to make important decisions, at least for now.”
Tal Golan, one of the paper’s authors, told AFP that the models were “an exciting technology that can dramatically supplement human productivity”.
However, he argued that “it is premature to allow these models to replace human decision-making in fields such as law, medicine or student assessment”.
Among the pitfalls, he said, was the possibility that people could deliberately exploit these blind spots to manipulate the models.
AI models burst into the public consciousness with the release of ChatGPT last year; the chatbot has since been credited with passing various exams and touted as a potential assistant for doctors, lawyers and other professionals.