Why the New York Times Might Win its Copyright Lawsuit Against OpenAI

posted by janrinok on Thursday February 22 2024, @06:46AM

from the data-hoovering dept.

https://arstechnica.com/tech-policy/2024/02/why-the-new-york-times-might-win-its-copyright-lawsuit-against-openai/

The day after The New York Times sued OpenAI for copyright infringement, the author and systems architect Daniel Jeffries wrote an essay-length tweet arguing that the Times "has a near zero probability of winning" its lawsuit. As we write this, it has been retweeted 288 times and received 885,000 views.
"Trying to get everyone to license training data is not going to work because that's not what copyright is about," Jeffries wrote. "Copyright law is about preventing people from producing exact copies or near exact copies of content and posting it for commercial gain. Period. Anyone who tells you otherwise is lying or simply does not understand how copyright works."
[...] Courts are supposed to consider four factors in fair use cases, but two of these factors tend to be the most important. One is the nature of the use. A use is more likely to be fair if it is "transformative"—that is, if the new use has a dramatically different purpose and character from the original. Judge Rakoff dinged MP3.com as non-transformative because songs were merely "being retransmitted in another medium."
In contrast, Google argued that a book search engine is highly transformative because it serves a very different function than an individual book. People read books to enjoy and learn from them. But a search engine is more like a card catalog; it helps people find books.
The other key factor is how a use impacts the market for the original work. Here, too, Google had a strong argument since a book search engine helps people find new books to buy.
[...] In 2015, the Second Circuit ruled for Google. An important theme of the court's opinion is that Google's search engine was giving users factual, uncopyrightable information rather than reproducing much creative expression from the books themselves.
[...] Recently, we visited Stability AI's website and requested an image of a "video game Italian plumber" from its image model Stable Diffusion.
[...] Clearly, these models did not just learn abstract facts about plumbers—for example, that they wear overalls and carry wrenches. They learned facts about a specific fictional Italian plumber who wears white gloves, blue overalls with yellow buttons, and a red hat with an "M" on the front.
These are not facts about the world that lie beyond the reach of copyright. Rather, the creative choices that define Mario are likely covered by copyrights held by Nintendo.

We are not the first to notice this issue. When one of us (Tim) first wrote about these lawsuits last year, he illustrated his story with an image of Mickey Mouse generated by Stable Diffusion. In a January piece for IEEE Spectrum, cognitive scientist Gary Marcus and artist Reid Southen showed that generative image models produce a wide range of potentially infringing images—not only of copyrighted characters from video games and cartoons but near-perfect copies of stills from movies like Black Widow, Avengers: Infinity War, and Batman v Superman.
In its lawsuit against OpenAI, the New York Times provided 100 examples of GPT-4 generating long, near-verbatim excerpts from Times articles.
[...] Those who advocate a finding of fair use like to split the analysis into two steps, which you can see in OpenAI's blog post about The New York Times lawsuit. OpenAI first categorically argues that "training AI models using publicly available Internet materials is fair use." Then in a separate section, OpenAI argues that "'regurgitation' is a rare bug that we are working to drive to zero."
But the courts tend to analyze a question like this holistically; the legality of the initial copying depends on details of how the copied data is ultimately used.

Previously on SoylentNews:
New York Times Sues Microsoft, ChatGPT Maker OpenAI Over Copyright Infringement - 20231228
Report: Potential NYT lawsuit could force OpenAI to wipe ChatGPT and start over - 20230821

Original Submission

This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.

23 comments

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Related Stories

New York Times Sues Microsoft, ChatGPT Maker OpenAI Over Copyright Infringement

Image-scraping Midjourney bans rival AI firm for scraping images

Example (Score: 2) by DrkShadow on Thursday February 22 2024, @07:03AM (5 children)

Re:Example (Score: 4, Interesting) by Anonymous Coward on Thursday February 22 2024, @08:46AM

Re:Example (Score: 2) by stormreaver on Thursday February 22 2024, @02:19PM (2 children)

Re:Example (Score: 2) by DrkShadow on Saturday February 24 2024, @05:38AM (1 child)

Re:Example (Score: 2) by stormreaver on Sunday February 25 2024, @12:49AM

Re:Example (Score: 3, Interesting) by Freeman on Thursday February 22 2024, @02:48PM

hype, not AI (Score: 5, Insightful) by bzipitidoo on Thursday February 22 2024, @08:26AM (2 children)

Re:hype, not AI (Score: 0) by Anonymous Coward on Thursday February 22 2024, @08:52AM (1 child)

Re:hype, not AI (Score: 3, Funny) by Anonymous Coward on Thursday February 22 2024, @01:18PM

Hilarious (Score: 5, Funny) by stormreaver on Thursday February 22 2024, @02:21PM

Copyright or trademark? (Score: 4, Interesting) by ElizabethGreene on Thursday February 22 2024, @02:34PM (1 child)

Re: Copyright or trademark? (Score: 2) by Freeman on Thursday February 22 2024, @02:57PM

Circumstances (Score: 5, Insightful) by Freeman on Thursday February 22 2024, @03:01PM (5 children)

Re:Circumstances (Score: 0, Disagree) by Anonymous Coward on Thursday February 22 2024, @08:50PM (4 children)

Re:Circumstances (Score: 3, Interesting) by Freeman on Friday February 23 2024, @02:26PM (3 children)

Re:Circumstances (Score: 0) by Anonymous Coward on Saturday February 24 2024, @06:29AM (2 children)

Re:Circumstances (Score: 2) by Freeman on Monday February 26 2024, @02:23PM (1 child)

Re:Circumstances (Score: 0) by Anonymous Coward on Tuesday February 27 2024, @07:58AM

HUMANS do copyright infringement (Score: 3, Interesting) by DannyB on Thursday February 22 2024, @04:50PM (4 children)

Re:HUMANS do copyright infringement (Score: 2, Touché) by Anonymous Coward on Friday February 23 2024, @02:06AM (1 child)

Re:HUMANS do copyright infringement (Score: 2) by DannyB on Friday February 23 2024, @03:24PM

Re:HUMANS do copyright infringement (Score: 2) by Freeman on Friday February 23 2024, @02:35PM (1 child)

Re:HUMANS do copyright infringement (Score: 2) by DannyB on Friday February 23 2024, @03:21PM
