Blogger Matt Webb points out that nations have begun to need a strategic fact reserve, in light of the problem arising from LLMs and other AI models starting to consume and re-process the slop which they themselves have produced.
The future needs trusted, uncontaminated, complete training data.
From the point of view of national interests, each country (or each trading bloc) will need its own training data, as a reserve, and a hedge against the interests of others.
Probably the best way to start is to take a snapshot of the internet and keep it somewhere really safe. We can sift through it later; the world's data will never be more available or less contaminated than it is today. Like when GitHub stored all public code in an Arctic vault (02/02/2020): a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. Or the Svalbard Global Seed Vault.
But actually I think this is a job for librarians and archivists.
What we need is a long-term national programme to slowly, carefully accept digital data into a read-only archive. We need the expertise of librarians, archivists and museums in the careful and deliberate process of acquisition and accessioning (PDF).
(Look and if this is an excuse for governments to funnel money to the cultural sector then so much the better.)
It should start today.
Already, AI slop is filling the WWW and starting to drown out legitimate, authoritative sources through sheer volume.
Previously
(2025) Meta's AI Profiles Are Already Polluting Instagram and Facebook With Slop
(2024) Thousands Turned Out For Nonexistent Halloween Parade Promoted By AI Listing
(2024) Annoyed Redditors Tanking Google Search Results Illustrates Perils of AI Scrapers
(Score: 5, Insightful) by Unixnut on Tuesday January 21 2025, @10:51AM (15 children)
The problem is that I don't think a "strategic fact reserve" will help the situation. Unlike a strategic reserve of a commodity (such as oil) which is fungible and apolitical, what qualifies as a "fact" is very much open to interpretation.
As a result there is no doubt such fact reserves will be "tweaked" to represent whatever the politicians in power prefer, meaning people would automatically distrust such fact reserves, and by extension any LLMs that make use of them.
So your choice would be between "Government approved" LLMs that most likely will be censored and tweaked to say what those in power want to say, or non-government approved LLMs (assuming they don't get banned) that may well be uncensored and free, but could end up producing nonsense due to consuming nonsense as part of their training.
I believe freedom and decentralisation are among the great things about the internet, yet everywhere I look people are pushing for centralisation and control, now of what actually constitutes approved "facts" which would feed the LLMs people would use as their source of information. It sounds like a serious risk of tyranny by those who control the "facts".
From my side, one potential solution came to me when I read "But actually I think this is a job for librarians and archivists." in TFA. A potential decentralised network of global libraries individually curating and offering information in a standard format for LLMs to subscribe to for training seems like a better solution than one big central "fact reserve" per country (or economic/military/political bloc).
For one thing it would allow information to be distributed across the globe, secondly it would allow differing viewpoints to be expressed, and LLMs can pick and choose which sources they want to train on. Plus libraries have been dying out due to fewer people reading physical books (or reading in general to be honest), so this may well be a way of allowing them to survive in their local community by giving them a second use beyond a physical repository of knowledge.
(Score: 5, Informative) by canopic jug on Tuesday January 21 2025, @11:10AM (10 children)
Plus libraries have been dying out due to fewer people reading physical books (or reading in general to be honest), [...]
Can't speak for around your country, but around here libraries are required to cull a double-digit portion of their collections annually to make room for "new" acquisitions. However, the selection criteria are neither quality nor relevance but instead only newness. Thus the best of breed and, sometimes, irreplaceable artifacts are sent off to the incinerator. (A few years ago, the region's schools sent students to the libraries for what 10 years ago would have been an easy assignment regarding history. However, almost all the relevant books and other historical materials had already been burned forcing cancellation of the lessons which had to be replaced with other activities. It is a slow-burn (pardon the pun) cultural revolution with the fires being lit by corporations rather than politicians.) Increasingly, the new acquisitions are further restricted to what particular companies are willing to supply.
Also around here, the library heads and one layer of managers below them were replaced years ago with political representatives who are there to represent local political interests inside the library rather than vice versa. The result is that nothing related to traditional library activities is happening or is going to happen for the foreseeable future. However, there are a lot of long term friends of local politicians drooling over the real estate which the libraries are occupying (for now).
Kiss those PISA scores goodbye.
tldr; libraries are not dying, they are being killed and those doing so have names and addresses
Money is not free speech. Elections should not be auctions.
(Score: 4, Informative) by looorg on Tuesday January 21 2025, @11:28AM (5 children)
They do something similar here, it's a process -- On shelf, in archive, sold, burned. So it takes a while before the book burning starts. But there is a process to replace old with new. That said I do visit the library, either at the university or city a few times per month. What is noticeable for several decades now is how they are changing. It's not so much about the book and knowledge anymore. The library is about entertainment.
Both libraries I visit now have a cafe/salad bar attached. They do have an excellent cookie-buffet at the local library.
There is free internet service, multiple computers. Free to borrow.
There is a large selection of music and movie DVDs available to borrow. There is also a large selection of computer and console games, I suspect that this one will dwindle tho as they are all physical copies so they can't have things that actually tie to an account on Steam or anything. So that is probably a short lived thing that will get removed "soon".
I mostly go there to either pick up some book or to read papers or magazines. But it would seem like old books are going away, getting replaced by new. There is a large focus on books, comics and graphic novels for children. Even tho most of the visitors are older, or seniors.
Anyway I think it was a long time since the library was actually some kind of eternal repository for knowledge. There are no stern librarians around anymore to shush you if you talk. Which I find odd and annoying, if I sit and read I don't want to hear other idiots talk on their phones.
(Score: 4, Informative) by Unixnut on Tuesday January 21 2025, @12:19PM (1 child)
Round these parts (UK/Europe) I have never heard of books being burned. Generally books go through a cycle. When new books/stock is delivered, if there is no space then the old books are split into two:
1. Rare books are kept
2. Common books are disposed of
Then the next step (when another new stock of books comes) is that the process above is repeated, except the last generation "rare books" get sent to the central library to be archived (and nowadays digitised).
As for the disposal itself, first thing the library will do is offer the books for free to the public, as well as charities and local bookshops. The local bookshops will try to resell the old books, and if they run out of space due to new stock arriving, they will offer them for free (usually they place them in front of the book shop on the public way, you just pick whatever takes your fancy and walk off with it). Finally any books that are not even taken for free are sent to the paper recycling mill, where they get turned into more paper.
The charities will do similar, either sell them to raise money for their cause, or in some cases provide books to their cause directly (e.g. old people's homes or hospital wards).
Problem is the libraries round here are stuck in a doom loop of degradation. I don't think my local library has had new stock delivered in 10 years at least. It's a cycle of funding cuts because "nobody reads books anymore and library visits are down on last year" resulting in no funding to buy new stock, resulting in nothing much new for people to read, causing fewer people to visit, which then starts another round of funding cuts.
My local library looks really neglected and the above is the reason the librarians gave, until the libraries are shut down and the buildings sold to property developers for more expensive housing projects (the cynic in me would say this might well be the real reason their funding was cut in the first place).
Having the libraries as a repository of knowledge for LLMs would give them a reason to get more funding, so not only can they store, digitise and curate information, the increased funding could well improve the experience, bring in new stock and entice people to visit, thereby regenerating them and breaking the cycle of stagnation.
(Score: 3, Insightful) by pTamok on Tuesday January 21 2025, @01:19PM
Perhaps the data hoarders will help/save some future generation.
It's important to document what is happening now so that future generations (if they come to exist) have the opportunity to avoid making the same mistakes as us.
(Score: 4, Informative) by pTamok on Tuesday January 21 2025, @02:39PM (2 children)
Fahrenheit 451 [wikipedia.org] (published in 1953).
If you have not read it, or at least seen the film, then maybe you should.
I also recommend The Pedestrian [wikipedia.org] (published in 1951).
(Score: 2) by Thexalon on Tuesday January 21 2025, @05:04PM
That some people outright refuse to read Fahrenheit 451 demonstrates some of the major themes of the story.
"Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
(Score: 2) by Beryllium Sphere (r) on Wednesday January 22 2025, @02:05AM
And see if Seashells remind you of anything today.
(Score: 2) by ElizabethGreene on Wednesday January 22 2025, @01:30PM (3 children)
A significant portion of libraries don't buy most of their new release titles. The books are leased from the publisher and go back at the end of the lease. The library controls the lease length based on popularity. These books can often be identified by a sticker with the name of the publisher on the top or bottom of the spine on the dust jacket, below the protective plastic dust jacket cover.
If you want to hear sad librarian noises, ask how many books never get checked out, even once.
Near the Florida/Georgia line there's a bookstore that sells off-lease books. I make a point to stop there whenever we drive through.
(Score: 2) by aafcac on Wednesday January 22 2025, @03:43PM (2 children)
That's pretty much inevitable. They're buying on an estimation of what will be popular. It's probably worse now with fewer people coming in to look and collections that are much larger than they used to be. And the ease of buying ebooks is probably not helping.
(Score: 2) by ElizabethGreene on Wednesday January 22 2025, @05:20PM (1 child)
The ease of buying ebooks, particularly audiobooks, has indeed reduced my library visits.
That said, I'm more likely to purchase a physical book now than I used to be. Maybe it's old-fart syndrome, but it's easier for me to sit down and read a physical book than an eBook. That wasn't the case previously, but I've gone headlong into the dopamine addiction trap of notification checking and doomscrolling. Paper seems to help break that loop.
We're not going to talk about the unread **pile**. :| Nope, nothing to see there. :D
(Score: 2) by aafcac on Wednesday January 22 2025, @05:32PM
It's kind of unfortunate as around here the libraries have been converted into community centers more than libraries. People do check out tons of books and the library system is one of the most used systems in the country, but it's so much less accessible for me than back when I was a kid due to just how loud it is. But, in theory, the fact that it's more of a community center probably does bring in people and allow the librarians to guide people to books that are less loved but are the right book for whatever it is that the patron is looking for.
That being said, the local library does have a ton of downloadable books and videos, so it's not like they're taking this lying down and to a large extent any reading that people do is probably a good thing.
(Score: 4, Funny) by c0lo on Tuesday January 21 2025, @01:30PM
True. You see, there's... hang on, TikTok is back, up and running again, talk to you later :large-grin:
https://www.youtube.com/@ProfSteveKeen https://soylentnews.org/~MichaelDavidCrawford
(Score: 0) by Anonymous Coward on Tuesday January 21 2025, @05:04PM
One of the worst funded/supported fields of endeavor.
The Internet Archive as an example
Do they even have an endowment?
(Score: 2) by aafcac on Wednesday January 22 2025, @05:37PM (1 child)
That's more or less it. Even under the most benevolent and well-intended system, with checks and balances that are completely effective at keeping bias out, a lot of things that we view as being facts have changed, over the previous few decades in particular. It's not always due to bad actors; a lot of it is because people have pushed the boundaries of research and learned that we were previously wrong. The things that are really and truly incontrovertible tend to be pretty infrequent. I'm not sure what disputing the atomic number of an element would even look like, but it tends to be things like that where having a fact reserve is possible. It's also so basic as to be completely pointless, as it's the stuff made of atoms that people dispute when it comes to fluoride in the drinking water and various stuff in vaccines.
(Score: 1, Funny) by Anonymous Coward on Thursday January 23 2025, @05:35AM
Bro they put fluorine in the toothpaste now. DON'T BRUSH. It'll make your kids gay.