(Reuters) – ChatGPT maker OpenAI said on Thursday it plans to work with organizations to produce public and private datasets for training artificial intelligence models.
The popular chatbot ChatGPT, which can generate poetry and prose from simple prompts, is based on large language models trained on open source data available on the Internet.
The company’s latest effort may help it produce training data that is more nuanced and conversational.
“We specifically look for data that expresses human intent, in any language, subject, and format,” the company said in a blog post.
OpenAI said it is looking for partners to help it create an open-source dataset for training language models. The dataset would be public for anyone to use for AI model training, it said.
The company also said it is preparing private datasets for training its own artificial intelligence models.