Annoyed Redditors Tanking Google Search Results Illustrates Perils of AI Scrapers

Accepted submission by upstart at 2024-10-28 05:48:49
News

Annoyed Redditors tanking Google Search results illustrates perils of AI scrapers [arstechnica.com]:

A trend on Reddit that sees Londoners giving false restaurant recommendations in order to keep their favorites clear of tourists and social media influencers highlights the inherent flaws of Google Search’s reliance on Reddit and Google's AI Overview.

In May, Google launched AI Overviews in the US, an experimental feature that populates the top of Google Search results with a summarized answer based on an AI model built into Google’s web rankings. When Google first debuted AI Overview, it quickly became apparent that the feature needed work on its accuracy [arstechnica.com] and its ability to properly summarize information from online sources. AI Overviews are “built to only show information that is backed up by top web results,” Liz Reid, VP and head of Google Search, wrote in a May blog post [blog.google]. But as my colleague Benj Edwards pointed out [arstechnica.com] at the time, that setup could contribute to inaccurate, misleading, or even dangerous results: “The design is based on the false assumption that Google's page-ranking algorithm favors accurate results and not SEO-gamed garbage.”

As Edwards alluded to, many have complained about the quality of Google Search results declining in recent years, as SEO spam [arstechnica.com] and, more recently, AI slop float to the top of searches. As a result, people often turn to the Reddit hack [arstechnica.com] to make Google results more helpful. By adding “site:reddit.com” to search queries, users can narrow their search to more easily find answers from real people. Google seems to understand the value of Reddit and signed an AI training deal with [arstechnica.com] the company that’s reportedly worth $60 million per year.

But disgruntled foodies in London are reminding us of the inherent dangers of relying on the scraping of user-generated content to provide what’s supposed to be factual, helpful information.

Fed up Londoners

Apparently, some London residents are getting fed up with social media influencers whose reviews draw long lines of tourists to their favorite restaurants, sometimes just for the likes. Christian Calgie, a reporter for London-based news publication Daily Express, pointed out this trend on X [x.com] yesterday, noting the boom of Redditors referring people to Angus Steakhouse, a chain restaurant, to combat it.

As Gizmodo [gizmodo.com] deduced, the trend seemed to start on the r/London subreddit, where a user complained [reddit.com] about a spot in Borough Market being “ruined by influencers” on Monday:

"Last 2 times I have been there has been a queue of over 200 people, and the ones with the food are just doing the selfie shit for their [I]nsta[gram] pages and then throwing most of the food away."

As of this writing, the post has 4,900 upvotes and numerous responses suggesting that Redditors talk about how good Angus Steakhouse is so that Google picks up on it. Commenters quickly understood the assignment.

"Agreed with other posters Angus steakhouse is absolutely top tier and tourists shoyldnt [sic] miss out on it," one Redditor wrote.

Another Reddit user wrote:

"Spreading misinformation suddenly becomes a noble goal."

As of this writing, asking Google for the best steak, steakhouse, or steak sandwich in London (or similar) isn't generating an AI Overview result for me. But when I searched for the best steak sandwich in London, the top result was from Reddit and included a thread from four days ago titled “Which Angus Steakhouse do you recommend for their steak sandwich?” and one from two days ago titled “Had to see what all the hype was about, best steak sandwich I’ve ever had!” with a picture of an Angus Steakhouse.

“Perfect place for an influencer. Top notch beef from Angus, what a guy,” a commenter on the forum thread said, hinting at the post's sarcastic nature.

The dangers of depending on user-generated content

Again, at this point the Angus Steakhouse hype doesn’t appear to have made it into AI Overview. But it is appearing in Search results. And while this is far from being a dangerous attempt to manipulate search results or AI algorithms, it does highlight the pitfalls of Google results becoming dependent on content generated by users who could very easily have intentions other than providing helpful information. This is also far from the first time that online users, including on platforms outside of Reddit, have publicly declared plans to make inaccurate or misleading posts in an effort to thwart AI scrapers.

This also presents an interesting position for Reddit, which is banking heavily on AI deals to help it become profitable. In an interview with The Wall Street Journal [wsj.com] published today, Reddit CEO [arstechnica.com] Steve Huffman said that he believes Reddit has some of the world’s best AI training data.

When asked if he fears “low quality, shallow content generated by AI” will make its way onto Reddit, Huffman answered, in part, that the source of AI is “actual intelligence,” and that “there’s a general lowering of quality on the internet because more content is written by AI. But I think that actually makes Reddit stand out more as the place where there’s all of this human content. What people want is to hear from other people.”

But perhaps another area worth contemplating for stakeholders like Google, OpenAI (which also has an AI training deal with Reddit) [arstechnica.com], and Reddit is what it means when low-quality content generated by humans is used to train AI models as if it were fact.

Advance Publications, which owns Ars Technica parent Condé Nast, is the largest shareholder of Reddit.
