The comedian alleges the companies didn't obtain her consent before using her work to train their LLMs.

Sarah Silverman Files Lawsuit Against OpenAI and Meta for Alleged Copyright Infringement

Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, has filed lawsuits against OpenAI and Meta. The legal action claims that these companies utilized copyrighted content, including the works of the plaintiffs, to train their language models without seeking permission. The complaints were filed on Friday and reported by Gizmodo.

The complaints focus on the datasets OpenAI and Meta allegedly used to train ChatGPT and LLaMA. In OpenAI’s case, the “Books1” dataset is roughly the size of Project Gutenberg, a well-known repository of copyright-free books, but the plaintiffs’ lawyers argue that the “Books2” dataset is too large to have been derived from anything other than so-called “shadow libraries” of illegally available copyrighted material, such as Library Genesis and Sci-Hub. Everyday pirates can access these materials via direct downloads, but perhaps more useful for those building large language models is that many shadow libraries also make written material available in bulk torrent packages.

One exhibit from Silverman’s lawsuit includes an exchange between the comedian’s attorneys and ChatGPT. Silverman’s legal team asked the chatbot to summarize The Bedwetter, her 2010 memoir. Not only was the chatbot able to outline entire sections of the book, but some of the passages it relayed appeared to be reproduced verbatim.

Silverman, Golden and Kadrey are not the first authors to sue OpenAI for copyright infringement. In fact, the company faces multiple legal challenges over how it trained ChatGPT. In June alone, two separate complaints were filed against the company. One is a broad class-action lawsuit alleging OpenAI violated federal and state privacy laws by scraping data to train the large language models behind ChatGPT and DALL-E.
