SoylentNews is people
posted by hubie on Tuesday January 21 2025, @09:39AM
from the avoiding-the-ouroboros-of-LLM-slop dept.

Blogger Matt Webb points out that nations now need a strategic fact reserve, because LLMs and other AI models have started to consume and re-process the slop they themselves have produced.

The future needs trusted, uncontaminated, complete training data.

From the point of view of national interests, each country (or each trading bloc) will need its own training data, as a reserve, and a hedge against the interests of others.

Probably the best way to start is to take a snapshot of the internet and keep it somewhere really safe. We can sift through it later; the world's data will never be more available or less contaminated than it is today. Like when GitHub stored all public code in an Arctic vault (02/02/2020): a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. Or the Svalbard Global Seed Vault.

But actually I think this is a job for librarians and archivists.

What we need is a long-term national programme to slowly, carefully accept digital data into a read-only archive. We need the expertise of librarians, archivists and museums in the careful and deliberate process of acquisition and accessioning (PDF).

(Look, and if this is an excuse for governments to funnel money to the cultural sector then so much the better.)

It should start today.

Already, AI slop is filling the WWW and starting to drown out legitimate, authoritative sources through sheer volume.

Previously
(2025) Meta's AI Profiles Are Already Polluting Instagram and Facebook With Slop
(2024) Thousands Turned Out For Nonexistent Halloween Parade Promoted By AI Listing
(2024) Annoyed Redditors Tanking Google Search Results Illustrates Perils of AI Scrapers


Original Submission

 
This discussion was created by hubie (1068) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 2) by ElizabethGreene (6748) on Wednesday January 22 2025, @05:20PM (#1389829)

    The ease of buying ebooks, particularly audiobooks, has indeed reduced my library visits.

    That said, I'm more likely to purchase a physical book now than I used to be. Maybe it's old-fart syndrome, but it's easier for me to sit down and read a physical book than an eBook. That wasn't the case previously, but I've gone headlong into the dopamine addiction trap of notification checking and doomscrolling. Paper seems to help break that loop.

    We're not going to talk about the unread **pile**. :| Nope, nothing to see there. :D

  • (Score: 2) by aafcac (17646) on Wednesday January 22 2025, @05:32PM (#1389832)

    It's kind of unfortunate, as around here the libraries have been converted into community centers more than libraries. People do check out tons of books, and the library system is one of the most used in the country, but it's far less accessible for me than it was when I was a kid because of just how loud it is. But, in theory, the fact that it's more of a community center probably does bring people in and lets the librarians guide them to books that are less loved but are the right book for whatever that patron is looking for.

    That being said, the local library does have a ton of downloadable books and videos, so it's not like they're taking this lying down and to a large extent any reading that people do is probably a good thing.