New York Times Takes the First Step in the Copyright Battle Against OpenAI and Microsoft
The New York Times has sued OpenAI, the creator of ChatGPT, and Microsoft for copyright infringement. As the use of generative AI tools like ChatGPT reaches new heights, one big question remains unanswered: how were these tools created, or rather, trained?
Creators like OpenAI and investors like Microsoft have been unwilling to explain clearly how content was acquired over the years to train large language models. But The New York Times claims to have found the tip of the iceberg. In the lawsuit, The New York Times says that OpenAI “copied” articles, reports, in-depth studies, opinion pieces, reviews and tutorials, among other content, to train the large language models (LLMs) that power ChatGPT and chatbots like Bing Chat, without prior permission or payment.
ChatGPT is a revolutionary technology. There was no doubt about it, at least in 2023. On the back of the hype, OpenAI quickly introduced subscription plans and started charging users. Microsoft, as always, seized the market opportunity and invested $10 billion. If you care to read through the lawsuit, you might be ready for an alternative thought: you pay OpenAI to get “smart” responses from ChatGPT, while OpenAI paid nothing to The New York Times to “copy and use” millions of pieces of its content, primarily to train ChatGPT.
The Times is not alone. The lawsuit says OpenAI “engaged in large-scale copying” from many media organizations, but gave the NYT’s content “particular emphasis.” Interestingly, this claim has not been disputed. OpenAI seems to accept it as fact and tried to find a “compromised solution”, which evidently means that OpenAI would pay the NYT or sign a commercial agreement with it. However, the negotiations failed, and no agreement has been concluded yet.
While Microsoft has decided to remain tight-lipped about the lawsuit, OpenAI spokeswoman Lindsey Held said the lawsuit had “surprised and disappointed” OpenAI and that the company had been “progressing constructively” toward a deal.
“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from A.I. technology and new revenue models,” Held said in a statement. “We hope to find a mutually beneficial way to work together, as we do with many other publishers.”
Frankly, OpenAI accepts that it used free content to train its models, and it was quick to take subscription money from users. Now that questions are being asked, OpenAI claims to be exploring new revenue-sharing models and working with publishers on them. That may be the case for big brands like The New York Times. But what about small media platforms? And how would anyone even know what content OpenAI has used, and from which source, to train its models? Like Google, OpenAI does not own the content. It just presents information acquired from others over the years and appears to present it as its own.
For the media fraternity, this lawsuit is a reminder of how Google got into trouble with Google News and why the service was shut down in countries like Spain, or of why Facebook had to stop its news business.
Creating news networks and getting the right workforce requires a lot of investment from media companies. And if tech companies simply copy and paste content and make money by presenting the same content on their preferred platform, an already strained media industry may soon cease to exist.
In the filings, The Times also cited Microsoft’s Bing search index, saying it copies and categorizes the paper’s online content to generate responses containing “verbatim excerpts and detailed summaries of Times articles” that are significantly longer and more detailed than those returned by traditional search engines.
It added that using others’ valuable intellectual property in these ways without paying for it has been extremely lucrative for the defendants. According to the complaint, Microsoft’s deployment of Times-trained LLMs across its product line helped boost its trillion-dollar market capitalization by hundreds of billions of dollars in the past year alone, and OpenAI’s release of ChatGPT has driven its valuation to as high as $90 billion.
Another interesting aspect of this lawsuit is that The Times is not asking for an exact amount of money. The lawsuit simply says that OpenAI and Microsoft “should be liable for billions of dollars in statutory and actual damages” for copyright infringement.
While it may appear that the lawsuit is about the past, namely OpenAI’s copying of content to train ChatGPT, what it really highlights is what will happen in the future. If readers are getting news and information from AI chatbots that essentially condense content from online media platforms, why would they bother clicking links to get to the actual source? This means lost visitors and ultimately lost advertising revenue.