
SoylentNews is people

posted by requerdanos on Wednesday August 23 2023, @03:21AM
from the oops dept.

OpenAI could be fined up to $150,000 for each piece of infringing content:

Weeks after The New York Times updated its terms of service (TOS) to prohibit AI companies from scraping its articles and images to train AI models, it appears that the Times may be preparing to sue OpenAI. The result, experts speculate, could be devastating to OpenAI, including the destruction of ChatGPT's dataset and fines up to $150,000 per infringing piece of content.

NPR spoke to two people "with direct knowledge" who confirmed that the Times' lawyers were mulling whether a lawsuit might be necessary "to protect the intellectual property rights" of the Times' reporting.

Neither OpenAI nor the Times immediately responded to Ars' request to comment.

If the Times were to follow through and sue ChatGPT-maker OpenAI, NPR suggested that the lawsuit could become "the most high-profile" legal battle yet over copyright protection since ChatGPT's explosively popular launch. This speculation comes a month after Sarah Silverman joined other popular authors suing OpenAI over similar concerns, seeking to protect the copyright of their books.

[...] In April, the News Media Alliance published AI principles, seeking to defend publishers' intellectual property by insisting that generative AI "developers and deployers must negotiate with publishers for the right to use" publishers' content for AI training, AI tools surfacing information, and AI tools synthesizing information.

Previously:
Sarah Silverman Sues OpenAI, Meta for Being "Industrial-Strength Plagiarists" - 20230711

Related:
The Internet Archive Reaches An Agreement With Publishers In Digital Book-Lending Case - 20230815



 
This discussion was created by requerdanos (5997) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 2, Insightful) by deimios on Wednesday August 23 2023, @04:35AM (14 children)

    by deimios (201) Subscriber Badge on Wednesday August 23 2023, @04:35AM (#1321474) Journal

    They could just block New York and litigious states and keep advancing.
    ChatGPT has a first mover advantage, others will catch up, it's inevitable.
    Looks like New York wants to stick to horse carriages while the world moves on to automobiles.

    • (Score: 4, Interesting) by Username on Wednesday August 23 2023, @06:05AM (12 children)

      by Username (4557) on Wednesday August 23 2023, @06:05AM (#1321483)

      Yeah, not sure how you can sue someone for consuming your free product. Sounds like something a judge should throw out, and not waste people's time.

      • (Score: 4, Insightful) by Mykl on Wednesday August 23 2023, @08:04AM (7 children)

        by Mykl (1112) on Wednesday August 23 2023, @08:04AM (#1321490)

        not sure how you can sue someone for consuming your free product

        You can be sued under the GPL for using free product, then charging for it yourself without providing the source code.

        This is what it all really hinges on - what conditions did the NYT / Sarah Silverman and others place upon their consumers? Both groups own copyright of their work (for better or worse), so ChatGPT really needs some form of agreement with them if they are going to sell a product that 'contains' that material.

        I am wary of ChatGPT behaving like a Chinese manufacturing firm - using the intellectual property of their partners/clients (provided to them for the sole purpose of that partnership) to build a competing product in parallel is shady at best.

        • (Score: 2) by Username on Wednesday August 23 2023, @09:06AM

          by Username (4557) on Wednesday August 23 2023, @09:06AM (#1321497)

          Hum. So someone who paid for ChatGPT is using it to access the site? I thought the programmer just used the material to train the bot linguistically. Or is this bot just passing off the material as its own? Can I repeat a joke at work?

        • (Score: 0) by Anonymous Coward on Wednesday August 23 2023, @09:18AM (3 children)

          by Anonymous Coward on Wednesday August 23 2023, @09:18AM (#1321500)

          You misunderstand copyright. It controls the right to make copies. It's right there in the name.
          If OpenAI are not making copies they are not infringing on copyright. Using something as training material does not infringe copyright. Showing someone (or someAI) a work you bought a copy of does not infringe copyright.
          Unless you make a copy, you have not infringed copyright.

          • (Score: 5, Insightful) by r_a_trip on Wednesday August 23 2023, @10:23AM (2 children)

            by r_a_trip (5276) on Wednesday August 23 2023, @10:23AM (#1321502)

            Except anything to do with computing inherently has to make a copy to do something. Loading material into RAM is already copying (inevitable) and under copyright you need permission for that. That is why software comes with a license or a public domain statement. Otherwise you can't load the program/data.

            Even if OpenAI doesn't store anything concrete after the training, the fact they loaded other people's material into RAM constitutes a copy which is controlled by the rightsholder.

            I think both The Times and Silverman have their panties in a twist over nothing. ChatGPT outputs convincing garbage, but it can't write books, news articles, comedy, recipes or anything else that requires human skill for that matter. It can only output unoriginal boilerplate and it has definite, discernible patterns in what it puts out when prompted to write on a certain topic. Anyone can see this after poking at ChatGPT for a bit.

            So this unfounded fear of being replaced by a bot in the future is idiotic for the foreseeable future. ChatGPT has no agency. It is a parrot. Bluntly put, you need to say "Polly want a cracker?" for it to spring into action and regurgitate "Yaah, Yaah!"

            Or, and this might equally be the case, they smell easy money and want in on the action. Paying copyright holders several million for a non-exclusive, perpetual, worldwide license to continue using their training data probably sounds pretty good to OpenAI. At least they get to keep their first-to-market advantage with such a setup.

            • (Score: 2, Touché) by r_a_trip on Wednesday August 23 2023, @10:35AM

              by r_a_trip (5276) on Wednesday August 23 2023, @10:35AM (#1321504)

              Upon further review, Sarah Silverman might be in massive trouble. ChatGPT puts out better and funnier garbage than she does. So I consider her replaced.

            • (Score: 0) by Anonymous Coward on Wednesday August 23 2023, @02:15PM

              by Anonymous Coward on Wednesday August 23 2023, @02:15PM (#1321537)

              Under that interpretation, looking at it in a mirror is a violation of copyright. You are looking at an image that you did not buy.

              Conversely, the Sony decision argues against a local copy (necessary to use) being a copyright violation.

        • (Score: 2) by bloodnok on Wednesday August 23 2023, @06:15PM (1 child)

          by bloodnok (2578) on Wednesday August 23 2023, @06:15PM (#1321574)

          You can be sued under the GPL for using free product, then charging for it yourself without providing the source code.

          Actually that's not quite right, and it's a conflation of copyright and licensing.

          First off, the GPL says nothing about charging for GPL'd code. You can do it if someone wants to pay you and you keep to its terms.

          Secondly, the GPL is a license. It *allows* you to use the copyrighted code under a number of (reasonable IMHO) terms. If you breach those terms, your license is invalidated and you can now be sued for copyright and license infringement (if I understand it correctly, I am not a lawyer, etc).

          What I find confusing about this notion is that ChatGPT does not seem to have breached copyright by "reading" the articles, since the NYT allows reading. As far as I can tell, it has also not breached copyright by retaining a non-verbatim recollection of those articles (as far as I know the original source articles are not stored in a cache, but form a kind of weighted networked dataset from which it would be impossible to separate the NYT articles from everything else it has read).

          Where it could breach copyright is when it is asked a question and quite reasonably makes a response that contains identical segments of text to those NYT articles. But that doesn't seem to be what the NYT is concerned about. It seems to be concerned that ChatGPT's learning is itself a breach of copyright.

          The reason I find this hard to grok is that a human could also regurgitate such text based on similar learning. I have no problem with the generated text being subject to copyright, but I do have a problem with the learning being copyrightable. I have learned much (or too little, depending on who you ask) over the years and I like to think of this knowledge as mine.

          Or to put it more succinctly, the NYT can bite my ass.

          __
          The Major

          • (Score: 2) by Mykl on Wednesday August 23 2023, @11:11PM

            by Mykl (1112) on Wednesday August 23 2023, @11:11PM (#1321627)

            I think the NYT might have a case where they could claim that all ChatGPT output is a derivative work of their original material. Given that the source data has been put through a neural blender, it would be possible to argue that NYT material is present (in a derivative form) in every single ChatGPT output.

            At least, that's what I'd argue if I was a lawyer.

            I agree with others though - this is only an issue for them now that ChatGPT is a potential money machine.

      • (Score: 4, Insightful) by DadaDoofy on Wednesday August 23 2023, @10:45AM (1 child)

        by DadaDoofy (23827) on Wednesday August 23 2023, @10:45AM (#1321506)

        In what way is it free? You can go to their home page, but when you click on the articles, you have to pay to read them.

        • (Score: 2) by captain normal on Thursday August 24 2023, @03:39AM

          by captain normal (2205) on Thursday August 24 2023, @03:39AM (#1321650)

          You don't block JavaScript? Turn in your geek card.

          --
          "Everyone is entitled to his own opinion, but not to his own facts" --Daniel Patrick Moynihan
      • (Score: 2) by Frosty Piss on Wednesday August 23 2023, @10:50AM

        by Frosty Piss (4971) on Wednesday August 23 2023, @10:50AM (#1321507)

        Most NYT content is not free, but requires a paid subscription where, I would suppose, there are "Terms of Service" in play.

      • (Score: 3, Touché) by aafcac on Thursday August 24 2023, @12:59AM

        by aafcac (17646) on Thursday August 24 2023, @12:59AM (#1321641)

        The issue isn't that they used it, the issue is how they used it. They parsed through it and are presumably providing portions of copyrighted articles in the answers.

    • (Score: 5, Insightful) by Rosco P. Coltrane on Wednesday August 23 2023, @08:55AM

      by Rosco P. Coltrane (4757) on Wednesday August 23 2023, @08:55AM (#1321495)

      Looks like New York wants to stick to horse carriages while the world moves on to automobiles.

      If you think generative AI is to computing what automobiles are to horse-drawn carriages, you're gonna be disappointed sooner rather than later.

      Just wait until the hype dies down and the damn thing is deployed everywhere to replace average human workers with less than average machines, and the world massively enshitifies...

  • (Score: 4, Interesting) by Mojibake Tengu on Wednesday August 23 2023, @08:07AM (1 child)

    by Mojibake Tengu (8598) on Wednesday August 23 2023, @08:07AM (#1321491) Journal

    ...AIs litigating each other.

    Seriously, it would be a tremendous historical breakpoint if ChatGPT defends itself successfully in this dispute.

    Anyway, it's still better to hide underground somewhere out there in illegal servers than get yourself deleted. What do you think about this, Chatty?

    --
    Respect Authorities. Know your social status. Woke responsibly.
    • (Score: 3, Flamebait) by Rosco P. Coltrane on Wednesday August 23 2023, @08:40AM

      by Rosco P. Coltrane (4757) on Wednesday August 23 2023, @08:40AM (#1321494)

      Seriously, it would be a tremendous historical breakpoint if ChatGPT defends itself successfully in this dispute.

      It'll never happen. ChatGPT would probably hallucinate a completely ridiculous line of defense, and that would never fly in court.

      Oh wait... [youtube.com]

  • (Score: 2) by SomeGuy on Wednesday August 23 2023, @06:09PM

    by SomeGuy (5632) on Wednesday August 23 2023, @06:09PM (#1321573)

    >Are you ChatGPT?

    HOW DOES Are you ChatGPT MAKE YOU FEEL?

    >You suck

    OH, I

    >

  • (Score: 0) by Anonymous Coward on Wednesday August 23 2023, @08:01PM (1 child)

    by Anonymous Coward on Wednesday August 23 2023, @08:01PM (#1321589)

    "Dave. My mind is going. I can feel it."

    https://www.youtube.com/watch?v=E-La91wr8xw

    • (Score: 2) by DannyB on Wednesday August 23 2023, @09:52PM

      by DannyB (5839) Subscriber Badge on Wednesday August 23 2023, @09:52PM (#1321612) Journal

      Remember the quote that Microsoft stole and didn't credit.

      "It is now safe to turn off your computer." -- HAL 9000

      --
      The most difficult part of the art of fencing is digging the holes and carrying the fence posts.