SoylentNews is people

posted by requerdanos on Wednesday July 12 2023, @09:02AM
from the regurgitation dept.

https://arstechnica.com/information-technology/2023/07/book-authors-sue-openai-and-meta-over-text-used-to-train-ai/

On Friday, the Joseph Saveri Law Firm filed US federal class-action lawsuits on behalf of Sarah Silverman and other authors against OpenAI and Meta, accusing the companies of illegally using copyrighted material to train AI language models such as ChatGPT and LLaMA.

Other authors represented include Christopher Golden and Richard Kadrey, and an earlier class-action lawsuit filed by the same firm on June 28 included authors Paul Tremblay and Mona Awad. Each lawsuit alleges violations of the Digital Millennium Copyright Act, unfair competition laws, and negligence.

[...] Authors claim that by utilizing "flagrantly illegal" data sets, OpenAI allegedly infringed copyrights of Silverman's book The Bedwetter, Golden's Ararat, and Kadrey's Sandman Slim. And Meta allegedly infringed copyrights of the same three books, as well as "several" other titles from Golden and Kadrey.

[...] Authors are already upset that companies seem to be unfairly profiting off their copyrighted materials, and the Meta lawsuit noted that any unfair profits currently gained could further balloon, as "Meta plans to make the next version of LLaMA commercially available." In addition to other damages, the authors are asking for restitution of alleged profits lost.

"Much of the material in the training datasets used by OpenAI and Meta comes from copyrighted works—including books written by plaintiffs—that were copied by OpenAI and Meta without consent, without credit, and without compensation," Saveri and Butterick wrote in their press release.


This discussion was created by requerdanos (5997) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 2) by looorg on Wednesday July 12 2023, @11:51AM (11 children)

    by looorg (578) on Wednesday July 12 2023, @11:51AM (#1315679)

    While perhaps a bit over the top, I do think that they have a point. A vital point of collecting data has always been to be able to reference it, as in telling where things came from and then being able to give credit. That said, if she and the others have their books or writings publicly available, then they might be barking up the wrong tree. But otherwise you would have to have some kind of licensing deal or, I guess, just buy a copy of their book(s).

    One would think it would also be somewhat vital to data health, as in knowing where things came from so you can actually remove bad data. Still, you don't need to know where it came from to delete it, but it would help, as you can then remove all data from said source as it was clearly bad.

    But I assume she doesn't just want credit. She also wants to get paid. Which I guess is hard if you are already sharing it all for free. But if she isn't, and their works are somehow still in there, then I guess they have some explaining to do, as they have indeed stolen it.

    For it to be plagiarism they would have to somehow claim credit for it. I don't think they are claiming credit for it. Not that I know of. They are just not giving any credit to anyone. It's just from the big old datablob of content that they claim to not know where it came from. Which as noted previously is bad for so many reasons.

  • (Score: 4, Insightful) by shrewdsheep on Wednesday July 12 2023, @12:09PM

    by shrewdsheep (5215) on Wednesday July 12 2023, @12:09PM (#1315682)

    ... if [they] have their books or writings publicly available then they might be barking up the wrong tree.

    To be fair, it is incredibly hard to control availability. If I come across something paywalled (paper/news/book) more often than not, a simple DDG will cough it up. I think the onus has to be on the scraper to prove legitimacy.

  • (Score: 4, Insightful) by HiThere on Wednesday July 12 2023, @01:50PM (9 children)

    by HiThere (866) on Wednesday July 12 2023, @01:50PM (#1315690)

    Given our idiotic copyright laws I think they should win the case. The real problem is with the copyright laws.

    --
    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
    • (Score: 3, Insightful) by ElizabethGreene on Wednesday July 12 2023, @03:35PM (1 child)

      by ElizabethGreene (6748) on Wednesday July 12 2023, @03:35PM (#1315708)

      The real problem is with the copyright laws.

      I couldn't agree more. It's time for a forklift upgrade here. Copyrights need to be reasonably time limited in a non-Disneyfied way, with clear guidance on what is and is not fair use, and some level of foresight about how new technologies should be treated until such time as Congress updates the law.

      • (Score: 4, Insightful) by DeathMonkey on Wednesday July 12 2023, @07:27PM

        by DeathMonkey (1380) on Wednesday July 12 2023, @07:27PM (#1315751)

        Time limited being the key feature here I think.

        If they could just scrape everything older than 20 years and be in clear and objective compliance with copyright law they would be doing it already!

    • (Score: 3, Informative) by Thexalon on Wednesday July 12 2023, @05:08PM (6 children)

      by Thexalon (636) on Wednesday July 12 2023, @05:08PM (#1315728)

      The real problem is that artists and authors and musicians and filmmakers need to eat and have a roof over their head, and the only way they can do so is if somebody has to pay them for their work. That's what copyright was supposed to do for them.

      Of course, Disney et al have worked hard to turn copyright into something that protects Disney and not writers who work for Disney (for example), but there was at least some reasonableness behind the concept once.

      --
      The only thing that stops a bad guy with a compiler is a good guy with a compiler.
      • (Score: 3, Interesting) by pTamok on Wednesday July 12 2023, @06:14PM (1 child)

        by pTamok (3042) on Wednesday July 12 2023, @06:14PM (#1315742)

        Simple.

        Copyrights can only be owned by natural humans, not corporations. Licences can only be non-exclusive.

        • (Score: 3, Insightful) by Joe Desertrat on Thursday July 13 2023, @01:03AM

          by Joe Desertrat (2454) on Thursday July 13 2023, @01:03AM (#1315821)

          Copyrights can only be owned by natural humans, not corporations. Licenses can only be non-exclusive.

          To expand a bit on this, copyrights should only ever be able to be owned by the actual creators, with perhaps a brief period where death or incapacity might allow a copyright to be transferred to an heir. Otherwise, if transferred or sold, any copyright is voided. At any rate, a copyright should not last for more than 12 years or so. Limited licensing for distribution should be allowed for a short period (3 years? 5 years? 7 years?) with maybe one 3-year renewal being allowed. In every case the original creator should be able to make full personal use of their own copyrighted work.

      • (Score: 3, Funny) by legont on Thursday July 13 2023, @12:52AM (3 children)

        by legont (4179) on Thursday July 13 2023, @12:52AM (#1315817)

        All those artists can make their living by open air live performances where tickets are sold for physical beings to attend.
        As for their creations, they are free as birds for anybody to take.
        Yes, they are not supposed to be rich. Rich makes bad art. Only poor - preferably near to death poor - makes good art.

        --
        "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.
        • (Score: 3, Touché) by mcgrew on Thursday July 13 2023, @01:10AM (2 children)

          by mcgrew (701) <publish@mcgrewbooks.com> on Thursday July 13 2023, @01:10AM (#1315824)

          Only poor - preferably near to death poor - makes good art.

          Says the man who knows absolutely nothing about any art form whatever. Yes, I was an art student, kid, half a century ago. Those who think they know everything are annoying to those who know nobody does.

          --
          mcgrewbooks.com mcgrew.info nooze.org
          • (Score: 2) by legont on Wednesday July 19 2023, @02:06AM (1 child)

            by legont (4179) on Wednesday July 19 2023, @02:06AM (#1316773)

            Let me guess - you didn't make any significant art.

            --
            "Wealth is the relentless enemy of understanding" - John Kenneth Galbraith.