Blogger Matt Webb points out that nations have begun to need a strategic fact reserve, in light of the problem arising from LLMs and other AI models starting to consume and re-process the slop which they themselves have produced.
The future needs trusted, uncontaminated, complete training data.
From the point of view of national interests, each country (or each trading bloc) will need its own training data, as a reserve, and a hedge against the interests of others.
Probably the best way to start is to take a snapshot of the internet and keep it somewhere really safe. We can sift through it later; the world's data will never be more available or less contaminated than it is today. Like when GitHub stored all public code in an Arctic vault (02/02/2020): a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. Or the Svalbard Global Seed Vault.
But actually I think this is a job for librarians and archivists.
What we need is a long-term national programme to slowly, carefully accept digital data into a read-only archive. We need the expertise of librarians, archivists and museums in the careful and deliberate process of acquisition and accessioning (PDF).
(Look, and if this is an excuse for governments to funnel money to the cultural sector, then so much the better.)
It should start today.
Already, AI slop is filling the WWW and starting to drown out legitimate, authoritative sources through sheer volume.
Previously
(2025) Meta's AI Profiles Are Already Polluting Instagram and Facebook With Slop
(2024) Thousands Turned Out For Nonexistent Halloween Parade Promoted By AI Listing
(2024) Annoyed Redditors Tanking Google Search Results Illustrates Perils of AI Scrapers
(Score: 2) by aafcac on Wednesday January 22 2025, @05:37PM
That's more or less it. Even under the most benevolent and well-intended system, with checks and balances that are completely effective at keeping bias out, a lot of things that we view as facts have changed, over the previous few decades in particular. It's not always due to bad actors; a lot of it is because people have pushed the boundaries of research and learned that we were previously wrong. The things that are really and truly incontrovertible tend to be pretty infrequent. I'm not sure what disputing the atomic number of an element would look like, but it tends to be things like that where having a fact reserve is possible. Facts like that are also so basic as to make the reserve completely pointless, since it's the stuff made of atoms that people dispute when it comes to fluoride in the drinking water and various stuff in vaccines.
(Score: 1, Funny) by Anonymous Coward on Thursday January 23 2025, @05:35AM
Bro they put fluorine in the toothpaste now. DON'T BRUSH. It'll make your kids gay.