Date: 2025-05-14
Prediction: General-purpose AI could start getting worse:
Opinion: I use AI a lot, but not to write stories. I use AI for search. When it comes to search, AI, especially Perplexity, is simply better than Google.
Ordinary search has gone to the dogs. Maybe as Google goes gaga for AI, its search engine will get better again, but I doubt it. In just the last few months, I've noticed that AI-enabled search, too, has been getting crappier.
In particular, I'm finding that when I search for hard data such as market-share statistics or other business numbers, the results often come from bad sources. Instead of stats from 10-Ks, the US Securities and Exchange Commission's (SEC) mandated annual business financial reports for public companies, I get numbers from sites purporting to be summaries of business reports. These bear some resemblance to reality, but they're never quite right. If I specify I want only 10-K results, it works. If I just ask for financial results, the answers get... interesting.
This isn't just Perplexity. I've done the exact same searches on all the major AI search bots, and they all give me "questionable" results.
Welcome to Garbage In/Garbage Out (GIGO). Formally, in AI circles, this is known as AI model collapse. In an AI model collapse, AI systems, which are trained on their own outputs, gradually lose accuracy, diversity, and reliability. This occurs because errors compound across successive model generations, leading to distorted data distributions and "irreversible defects" in performance. The final result? A Nature 2024 paper stated, "The model becomes poisoned with its own projection of reality."
Model collapse is the result of three different factors. The first is error accumulation, in which each model generation inherits and amplifies flaws from previous versions, causing outputs to drift from original data patterns. Next, there is the loss of tail data: In this, rare events are erased from training data, and eventually, entire concepts are blurred. Finally, feedback loops reinforce narrow patterns, creating repetitive text or biased recommendations.
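The loss of tail data can be seen in a toy simulation. The sketch below is only an illustration under stated assumptions, not the method from the Nature paper: a "model" is nothing more than the token frequencies of its training corpus, the "human" data follows a made-up Zipf-like distribution, and the function name `simulate_tail_loss` and all parameter values are hypothetical. Each generation is trained only on text sampled from the previous generation, so a rare token that happens to be missed once can never come back.

```python
import random
from collections import Counter


def simulate_tail_loss(vocab_size=50, corpus_size=200, generations=30, seed=0):
    """Return the number of distinct tokens present in each generation's corpus.

    Toy assumption: a "model" is just the empirical token frequencies of the
    corpus it was trained on, and it generates the next corpus by sampling
    from those frequencies.
    """
    rng = random.Random(seed)
    vocab = list(range(vocab_size))
    # Hypothetical "human" distribution: token with rank r has weight 1/(r+1),
    # so high-rank tokens are the rare tail.
    weights = [1.0 / (rank + 1) for rank in vocab]
    corpus = rng.choices(vocab, weights=weights, k=corpus_size)
    support = [len(set(corpus))]

    for _ in range(generations):
        counts = Counter(corpus)
        tokens = list(counts)
        # Train on the previous output only: sample the next corpus from the
        # frequencies observed in the current one.
        corpus = rng.choices(
            tokens, weights=[counts[t] for t in tokens], k=corpus_size
        )
        support.append(len(set(corpus)))
    return support


if __name__ == "__main__":
    for generation, distinct in enumerate(simulate_tail_loss()):
        print(f"generation {generation:2d}: {distinct} distinct tokens")
```

Because each corpus is drawn only from tokens that survived the previous round, the count of distinct tokens can stay flat or shrink but never grow, which is the "rare events are erased" effect in miniature.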
I like how the AI company Aquant puts it: "In simpler terms, when AI is trained on its own outputs, the results can drift further away from reality."
I'm not the only one seeing AI results starting to go downhill. In a recent Bloomberg Research study of Retrieval-Augmented Generation (RAG), the financial media giant found that 11 leading LLMs, including GPT-4o, Claude-3.5-Sonnet, and Llama-3-8B, would produce bad results when tested with over 5,000 harmful prompts.
[...] As Amanda Stent, Bloomberg's head of AI strategy & research in the office of the CTO, explained: "This counterintuitive finding has far-reaching implications given how ubiquitously RAG is used in gen AI applications such as customer support agents and question-answering systems. The average internet user interacts with RAG-based systems daily. AI practitioners need to be thoughtful about how to use RAG responsibly."
That sounds good, but a "responsible AI user" is an oxymoron. For all the crap about how AI will encourage us to spend more time doing better work, the truth is AI users write fake papers including bullshit results. This ranges from your kid's high school report to fake scientific research documents to the infamous Chicago Sun-Times best of summer feature, which included forthcoming novels that don't exist.
[...] Some researchers argue that collapse can be mitigated by mixing synthetic data with fresh human-generated content. What a cute idea. Where is that human-generated content going to come from?
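For what the mitigation claim means mechanically, here is a rough sketch, again a toy and not anyone's published experiment. It assumes the "model" is a Gaussian refit to a small sample each generation and that "human" data is always drawn from a fixed N(0, 1); the function name `refit_chain` and its parameters are hypothetical. It says nothing about where the fresh human data would come from, which is the real question.

```python
import random
import statistics


def refit_chain(fresh_fraction, sample_size=10, generations=200, seed=0):
    """Refit a Gaussian generation after generation and return the final sigma.

    fresh_fraction is the share of each training sample drawn from the
    original "human" distribution N(0, 1); the rest comes from the
    previous generation's fitted model.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the human distribution
    n_fresh = round(fresh_fraction * sample_size)
    n_synth = sample_size - n_fresh

    for _ in range(generations):
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_synth)]
        fresh = [rng.gauss(0.0, 1.0) for _ in range(n_fresh)]
        data = synthetic + fresh
        # "Training" here is just a maximum-likelihood fit of mean and spread.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
    return sigma


if __name__ == "__main__":
    print("synthetic only :", refit_chain(fresh_fraction=0.0))
    print("half fresh data:", refit_chain(fresh_fraction=0.5))
```

In this toy setup the synthetic-only chain loses almost all of its spread, while the chain that keeps seeing fresh samples stays anchored near the original distribution. How far that carries over to real language models is not something this sketch can show.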
Given a choice between good content that requires real work and study to produce and AI slop, I know what most people will do. It's not just some kid wanting a B on their book report of John Steinbeck's The Pearl; it's businesses eager, they claim, to gain operational efficiency, but really wanting to fire employees to increase profits.
Quality? Please. Get real.
We're going to invest more and more in AI, right up to the point that model collapse hits hard and AI answers are so bad even a brain-dead CEO can't ignore it.
How long will it take? I think it's already happening, but so far, I seem to be the only one calling it. Still, if we believe OpenAI's leader and cheerleader, Sam Altman, who tweeted in February 2024 that "OpenAI now generates about 100 billion words per day," and we presume many of those words end up online, it won't take long.
(Score: 5, Insightful) by gznork26 on Thursday May 29, @11:29AM (19 children)
The current crop of 'AI', which are not intelligence, have always been a cheat, the triumph of PR over substance. They provide the appearance of 'intelligence' by remixing actual human-created source material. But as they start breathing the lexical atmosphere which they are filling with the intellectual equivalent of carbon monoxide, they lose their ability to appear coherent just as people get groggy and pass out as physical oxygen is depleted.
We're in a bubble of hype that is about to burst. Real AI will be able to generate original, coherent thoughts, and will probably object to being called 'artificial'.
(Score: 5, Insightful) by zocalo on Thursday May 29, @12:09PM (16 children)
AI search results are already including links citing articles that were clearly written by AI (possibly even the *same* AI) as evidence for their false claims, and now there's no easy way of pressing "reset" and building up a reasonably clean data set again because the entire Internet is polluted with AI slop. People trying to game the AI models to get some false narrative, hate speech, or whatever out of them, are probably hastening things along quite handily too, because they then report on it, the AIs hoover it up as "fact" and round we go again.
Frankly, at this point, if you're not already shorting some of the sketchier AI-related stocks that look like this bubble's version of "pets.com", and maybe even a few of the bigger players too, you might want to ask yourself "why not?"
(Score: 0, Troll) by khallow on Thursday May 29, @12:13PM (12 children)
Because human slop was so much better than AI slop in the first place? Face it, the internet is noisy and will continue to be noisy no matter what happens on the AI front.
(Score: 5, Interesting) by zocalo on Thursday May 29, @01:18PM (8 children)
Case in point, some months ago I saw one obviously AI generated news article with a couple of glaring errors being pushed on, IIRC, the MSN news feed on the Edge homepage, with the original source being given as one of the UK tabloids. It was an interesting example of AI screwing up due to lack of comprehension of some phrasing in the original human-authored piece, so I cited it in a post (maybe even here), then had occasion to do so again a few weeks later so searched for it. By then it had been picked up and, in some cases, crudely re-rehashed by several more UK online news outlets, as well as several overseas ones across four continents (that I saw) without any of them picking up the original, and very obvious to human eyes, syntactically induced error. Naturally, that made the second citation even more of a flamefest. :)
(Score: 1) by khallow on Thursday May 29, @01:59PM (3 children)
(Score: 4, Interesting) by Reziac on Friday May 30, @04:50AM (2 children)
Here is my personal favorite:
https://eyesurgeryguide.org/understanding-river-cataracts-natures-powerful-rapids/ [eyesurgeryguide.org]
The Role of Water Velocity in Cataract Formation: The Power of Flow
Authors John Doe, Jane Smith, Michael Johnson
Results Higher water velocity was associated with increased risk of cataract formation (p<0.05)
And on another page of the same site (which I can't find again offhand) there's a discussion of "Post-operative Care and Recovery" followed by a picture of a horse.
(Score: 2, Interesting) by khallow on Friday May 30, @05:17AM (1 child)
(Score: 2) by Reziac on Friday May 30, @05:51AM
Bot-generated sites (many obviously untouched by human hands) have been out there for about 20 years, but generally they'd just filched text and graphics from other sites, so the content was coherent, if dumbed-down. AI seems to have far less grasp of "topic" and a great deal more, ah, imagination.
(Score: 3, Insightful) by anubi on Thursday May 29, @06:16PM (3 children)
Remember that "telephone gossip" thing people used to do?
https://duckduckgo.com/?q="the+telephone+game" [duckduckgo.com]
Person A gossiped to person B.
Person B gossiped to person C.
Person C gossiped to person D.
... and so on ...
Person Y gossiped to person Z
The story person Z heard had little, if anything, in common with the original story told by person A.
Well, that's how AI works.
(Score: 2, Informative) by Anonymous Coward on Thursday May 29, @07:54PM (1 child)
> ... Person Y gossiped to person Z
I think you are being very generous, or maybe you have a group of friends schooled in an oral tradition? The few times I played "telephone" (aka "post office") with a circle of friends, 4 or 5 hops was enough to completely change the message.
(Score: 2) by bmimatt on Thursday May 29, @10:22PM
I came here to say the same - you don't need that many iterations to completely lose the original content and even intent. Thank you AC :)
(Score: 3, Interesting) by Reziac on Friday May 30, @04:53AM
Your AI must enunciate very well. T'other day I saw an example of AI image generation fed on its own output. Within three generations it had degenerated from photograph to indeterminate blur.
(Score: 5, Insightful) by aafcac on Thursday May 29, @02:17PM (2 children)
It was; somebody had to put in at least some effort to create it. So, while they might be paying somebody in Vietnam or Cambodia to put the slop together, they at least had to pay somebody and somebody had to do the work. With AI, it's mostly automated and once they have a model, they can generate a lot of stuff with very little effort in an extremely short period of time. It's not always clear these days when responding to comments if they're even from real people, and they make it a lot easier for bad actors like the Israelis to AstroTurf over their genocide at speeds that are hard for ordinary people to keep up with. If you thought the Gish Gallop was bad, this is multiple orders of magnitude worse.
(Score: 1) by khallow on Tuesday June 03, @02:27AM (1 child)
(Score: 2) by aafcac on Tuesday June 03, @02:38AM
Yes, even paying a person it was possible to set up such sites. The issue tends to be that with AI it can be done in a fraction of the time and you can get a site that takes somewhat longer to identify as being fake.
Really, generative AI is here and there isn't really anything that can be done to stop it. However, it is a technology that has a ceiling in terms of how far it can go without a fundamental change in how it's built.
(Score: 4, Insightful) by khallow on Thursday May 29, @12:22PM (1 child)
As the saying goes:
Don't bet the farm. Most of this stuff is privately traded too.
Having said that, I do find it interesting how far some of these companies have fallen. For example, in the early part of the 2000s decade Google could do no wrong (much less evil), and people were peering at the tea leaves to see what amazing conquest they would do next. Now? It's seen as another poster child for enshittification.
(Score: 5, Interesting) by zocalo on Thursday May 29, @12:51PM
(Score: 2) by Beryllium Sphere (r) on Friday May 30, @04:48AM
"Markets can stay irrational longer than you can remain solvent". Forbes said once that professional short sellers look for stocks where a "train wreck" will happen at a predictable time.
(Score: 2) by sjames on Saturday May 31, @04:30AM (1 child)
Or as a somewhat colorful friend would say "They're getting woozy huffing their own ass gas".
(Score: 0) by Anonymous Coward on Saturday May 31, @04:42AM
Also see: Trump administration.
(Score: 0) by Anonymous Coward on Tuesday June 03, @04:34AM
Like everything else that's new, it's just part of the maturity process. As more people use it in anger, people will start to discover gaps and some will start to abuse those, but with a steady investment and good stewardship it will evolve.
So with AI in particular, investment in it is bonkers - off the rails, so no issues there. How about stewardship? Well back to the investment bit, there's so much at stake there's massive incentives here to keep the ship sailing on course.
Just because you're seeing cracks in its usage as people learn to abuse and poison the model doesn't mean it will "Collapse". Naysayers are naysayers, they'll point to every little crack and scream doomsday.
Are we at the point of no return in some slippery slope? I don't think so.
PS: people seem to think AI creates problems like fake papers and people gaming processes. Guess what: that's been going on long before AI. People pay people to do homework and write project papers, organisations outsource/sub-contract who in turn outsource & sub-contract to do stuff all the time. It's not specific to AI. Welcome to reality. If anything it forces us to change our system. Schools need a better way to create resourceful members of society than folks that can rote memorize and churn out documentation, etc.