Sarah Silverman Sues OpenAI, Meta for Being "Industrial-Strength Plagiarists"

posted by requerdanos on Wednesday July 12, @09:02AM
from the regurgitation dept.
Freeman writes:

https://arstechnica.com/information-technology/2023/07/book-authors-sue-openai-and-meta-over-text-used-to-train-ai/

On Friday, the Joseph Saveri Law Firm filed US federal class-action lawsuits on behalf of Sarah Silverman and other authors against OpenAI and Meta, accusing the companies of illegally using copyrighted material to train AI language models such as ChatGPT and LLaMA.

Other authors represented include Christopher Golden and Richard Kadrey, and an earlier class-action lawsuit filed by the same firm on June 28 included authors Paul Tremblay and Mona Awad. Each lawsuit alleges violations of the Digital Millennium Copyright Act, unfair competition laws, and negligence.

[...] Authors claim that by utilizing "flagrantly illegal" data sets, OpenAI allegedly infringed copyrights of Silverman's book The Bedwetter, Golden's Ararat, and Kadrey's Sandman Slim. And Meta allegedly infringed copyrights of the same three books, as well as "several" other titles from Golden and Kadrey.

[...] Authors are already upset that companies seem to be unfairly profiting off their copyrighted materials, and the Meta lawsuit noted that any unfair profits currently gained could further balloon, as "Meta plans to make the next version of LLaMA commercially available." In addition to other damages, the authors are asking for restitution of alleged profits lost.

"Much of the material in the training datasets used by OpenAI and Meta comes from copyrighted works—including books written by plaintiffs—that were copied by OpenAI and Meta without consent, without credit, and without compensation," Saveri and Butterick wrote in their press release.

  • (Score: 2) by sigterm (849) on Wednesday July 12, @10:01AM (#1315672)

    I have some doubts about it being illegal to scrape material that's been published on the open Internet.

    But I say Google and Meta should just remove this from their data sets. Sure, I will no longer be able to ask LLaMA or ChatGPT to do stuff like "rewrite the Declaration of Independence in the style of a painfully unfunny comedian," but I can live with that.
