posted by Fnord666 on Monday December 16, @04:31PM

Suchir Balaji, a former OpenAI employee, helped gather and organize the enormous amounts of internet data used to train the startup's ChatGPT chatbot:

A former OpenAI researcher, known for blowing the whistle on the blockbuster artificial intelligence company as it faces a swell of lawsuits over its business model, has died, authorities confirmed this week.

Suchir Balaji, 26, was found dead inside his Buchanan Street apartment on Nov. 26, San Francisco police and the Office of the Chief Medical Examiner said. Police had been called to the Lower Haight residence at about 1 p.m. that day, after receiving a call asking officers to check on his well-being, a police spokesperson said.

The medical examiner's office determined the manner of death to be suicide and police officials this week said there is "currently, no evidence of foul play."

[...] In a Nov. 18 letter filed in federal court, attorneys for The New York Times named Balaji as someone who had "unique and relevant documents" that would support their case against OpenAI. He was among at least 12 people — many of them past or present OpenAI employees — the newspaper had named in court filings as having material helpful to their case, ahead of depositions.


Related Stories

New York Times Sues Microsoft, ChatGPT Maker OpenAI Over Copyright Infringement 51 comments

The New York Times on Wednesday filed a lawsuit against Microsoft and OpenAI, the company behind popular AI chatbot ChatGPT, accusing the companies of creating a business model based on "mass copyright infringement," stating their AI systems "exploit and, in many cases, retain large portions of the copyrightable expression contained in those works:"

Microsoft both invests in and supplies OpenAI, providing it with access to the Redmond, Washington, giant's Azure cloud computing technology.

The publisher said in a filing in the U.S. District Court for the Southern District of New York that it seeks to hold Microsoft and OpenAI to account for the "billions of dollars in statutory and actual damages" it believes it is owed for the "unlawful copying and use of The Times's uniquely valuable works."

[...] The Times said in an emailed statement that it "recognizes the power and potential of GenAI for the public and for journalism," but added that journalistic material should be used for commercial gain with permission from the original source.

"These tools were built with and continue to use independent journalism and content that is only available because we and our peers reported, edited, and fact-checked it at high cost and with considerable expertise," the Times said.

Why the New York Times Might Win its Copyright Lawsuit Against OpenAI 23 comments

https://arstechnica.com/tech-policy/2024/02/why-the-new-york-times-might-win-its-copyright-lawsuit-against-openai/

The day after The New York Times sued OpenAI for copyright infringement, the author and systems architect Daniel Jeffries wrote an essay-length tweet arguing that the Times "has a near zero probability of winning" its lawsuit. As we write this, it has been retweeted 288 times and received 885,000 views.

"Trying to get everyone to license training data is not going to work because that's not what copyright is about," Jeffries wrote. "Copyright law is about preventing people from producing exact copies or near exact copies of content and posting it for commercial gain. Period. Anyone who tells you otherwise is lying or simply does not understand how copyright works."

[...] Courts are supposed to consider four factors in fair use cases, but two of these factors tend to be the most important. One is the nature of the use. A use is more likely to be fair if it is "transformative"—that is, if the new use has a dramatically different purpose and character from the original. Judge Rakoff dinged MP3.com as non-transformative because songs were merely "being retransmitted in another medium."

In contrast, Google argued that a book search engine is highly transformative because it serves a very different function than an individual book. People read books to enjoy and learn from them. But a search engine is more like a card catalog; it helps people find books.

The other key factor is how a use impacts the market for the original work. Here, too, Google had a strong argument since a book search engine helps people find new books to buy.

[...] In 2015, the Second Circuit ruled for Google. An important theme of the court's opinion is that Google's search engine was giving users factual, uncopyrightable information rather than reproducing much creative expression from the books themselves.

[...] Recently, we visited Stability AI's website and requested an image of a "video game Italian plumber" from its image model Stable Diffusion.

[...] Clearly, these models did not just learn abstract facts about plumbers—for example, that they wear overalls and carry wrenches. They learned facts about a specific fictional Italian plumber who wears white gloves, blue overalls with yellow buttons, and a red hat with an "M" on the front.

These are not facts about the world that lie beyond the reach of copyright. Rather, the creative choices that define Mario are likely covered by copyrights held by Nintendo.

OpenAI Says New York Times 'Hacked' ChatGPT to Build Copyright Lawsuit 6 comments

OpenAI has asked a federal judge to dismiss parts of the New York Times' copyright lawsuit against it, arguing that the newspaper "hacked" its chatbot ChatGPT and other artificial-intelligence systems to generate misleading evidence for the case:

OpenAI said in a filing in Manhattan federal court on Monday that the Times caused the technology to reproduce its material through "deceptive prompts that blatantly violate OpenAI's terms of use."

"The allegations in the Times's complaint do not meet its famously rigorous journalistic standards," OpenAI said. "The truth, which will come out in the course of this case, is that the Times paid someone to hack OpenAI's products."

OpenAI did not name the "hired gun" who it said the Times used to manipulate its systems and did not accuse the newspaper of breaking any anti-hacking laws.

[...] Courts have not yet addressed the key question of whether AI training qualifies as fair use under copyright law. So far, judges have dismissed some infringement claims over the output of generative AI systems based on a lack of evidence that AI-created content resembles copyrighted works.

Also at The Guardian, MSN and Forbes.


AI Companies Are Finally Being Forced To Cough Up For Training Data 15 comments

Arthur T Knackerbracket has processed the following story:

The music industry’s lawsuit sends the loudest message yet: High-quality training data is not free.

The generative AI boom is built on scale. The more training data, the more powerful the model. 

But there’s a problem. AI companies have pillaged the internet for training data, and many websites and data set owners have started restricting the ability to scrape their websites. We’ve also seen a backlash against the AI sector’s practice of indiscriminately scraping online data, in the form of users opting out of making their data available for training and lawsuits from artists, writers, and the New York Times, claiming that AI companies have taken their intellectual property without consent or compensation. 

Last week, three major record labels—Sony Music, Warner Music Group, and Universal Music Group—announced they were suing the AI music companies Suno and Udio over alleged copyright infringement. The music labels claim the companies made use of copyrighted music in their training data “at an almost unimaginable scale,” allowing the AI models to generate songs that “imitate the qualities of genuine human sound recordings.”

But this moment also sets an interesting precedent for all of generative AI development. Thanks to the scarcity of high-quality data and the immense pressure and demand to build even bigger and better models, we’re in a rare moment where data owners actually have some leverage. The music industry’s lawsuit sends the loudest message yet: High-quality training data is not free. 

It will likely take a few years at least before we have legal clarity around copyright law, fair use, and AI training data. But the cases are already ushering in changes. OpenAI has been striking deals with news publishers such as Politico, the Atlantic, Time, the Financial Times, and others, exchanging publishers’ news archives for money and citations. And YouTube announced in late June that it will offer licensing deals to top record labels in exchange for music for training. 

This discussion was created by Fnord666 (652) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 5, Funny) by ikanreed on Monday December 16, @04:37PM (2 children)

    by ikanreed (3164) on Monday December 16, @04:37PM (#1385614) Journal

    Between this and Boeing, these whistles are clearly a hazard to your health. Who's selling them?

    • (Score: 2, Interesting) by Username on Monday December 16, @06:18PM (1 child)

      by Username (4557) on Monday December 16, @06:18PM (#1385623)

      The way I see it, suicidal people are the most honest because they don't have to live with the consequences. Most people tell all their secrets before ending it.

      • (Score: 1, Informative) by Anonymous Coward on Tuesday December 17, @05:59AM

        by Anonymous Coward on Tuesday December 17, @05:59AM (#1385672)

        Most people tell all their secrets before ending it.

        Unless, that is, they end their life because they are depressed enough to not give a shit about anything anymore (thus - not that they'd care about this too but - why should they waste their last breath?)

  • (Score: 1, Interesting) by Frosty Piss on Monday December 16, @05:25PM (3 children)

    by Frosty Piss (4971) on Monday December 16, @05:25PM (#1385618)

    This person was not involved with any of the Ethics responsibilities. Most likely after being canned, he fell into despair at being shipped home since he was an H1B.

    • (Score: 2, Touché) by sbgen on Tuesday December 17, @04:36PM (1 child)

      by sbgen (1302) on Tuesday December 17, @04:36PM (#1385713)

The link at the top says "...he grew up in Cupertino..." so he may not have been on an H1B as you say? The link also seems to imply he left the company on his own because of his ethics considerations - so not canned?

      --
      Warning: Not a computer expert, but got to use it. Yes, my kind does exist.
      • (Score: -1, Troll) by Anonymous Coward on Tuesday December 17, @05:51PM

        by Anonymous Coward on Tuesday December 17, @05:51PM (#1385720)

        The article is incorrect. Biased pap, in fact.

  • (Score: 4, Insightful) by Rosco P. Coltrane on Monday December 16, @06:49PM (5 children)

    by Rosco P. Coltrane (4757) on Monday December 16, @06:49PM (#1385624)

But at some point it's entirely plausible that random people working for companies that do sketchy or debatable things simply take their own lives, or die for unrelated reasons, once in a while.

I'm with team coppers on this one: to my knowledge, he didn't express concerns for his safety or try to flee, he had planned a trip to Disneyland the day before (indicating that he had no intention of taking his life), and he didn't suffer from a bout of Russian high-rise window allergy. On top of that, he wasn't exactly somebody hot or controversial at OpenAI.

    In other words, he looks like he just died before his time, as people do sometimes.

    • (Score: 4, Touché) by c0lo on Tuesday December 17, @05:52AM (4 children)

      by c0lo (156) Subscriber Badge on Tuesday December 17, @05:52AM (#1385671) Journal

      I love a conspiracy theory as much as the next guy...but...

      You can be such a killjoy sometimes.

      Signed: the previous guy.

      --
      https://www.youtube.com/@ProfSteveKeen https://soylentnews.org/~MichaelDavidCrawford
      • (Score: 0) by Anonymous Coward on Wednesday December 18, @01:25AM (3 children)

        by Anonymous Coward on Wednesday December 18, @01:25AM (#1385752)

        after receiving a call asking officers to check on his well-being,

        Yeah nothing suspicious about that. Maybe someone was worried that other employees might commit suicide too and wished to discourage them? 😉

        • (Score: 2) by c0lo on Wednesday December 18, @04:44AM (2 children)

          by c0lo (156) Subscriber Badge on Wednesday December 18, @04:44AM (#1385764) Journal

          Maybe someone was worried that other employees might commit suicide too and wished to discourage them?

You mean... like... wished to provide an incentive for the other employees to opt being killed instead of choosing suicide? :D

          --
          https://www.youtube.com/@ProfSteveKeen https://soylentnews.org/~MichaelDavidCrawford
          • (Score: 0) by Anonymous Coward on Wednesday December 18, @10:02AM (1 child)

by Anonymous Coward on Wednesday December 18, @10:02AM (#1385772)

Nah, they don't want anyone killed/suicided unnecessarily. It costs money.

            It probably costs more to get people to do anonymous calls to cops to check on "suicided people", but that's likely cheaper than having to have more employees commit suicide.
            • (Score: 0) by Anonymous Coward on Wednesday December 18, @11:43AM

              by Anonymous Coward on Wednesday December 18, @11:43AM (#1385773)

              Nah, they don't want anyone killed/suicided unnecessarily.

              Whistles as hazards [soylentnews.org], blowing one may get you out of the "unnecessarily" safe zone.

  • (Score: 3, Funny) by ElizabethGreene on Tuesday December 17, @03:47AM

    by ElizabethGreene (6748) on Tuesday December 17, @03:47AM (#1385669) Journal

    Remember to tell Alexa, Siri, and the Roomba please and thank you, just in case.
