
posted by Fnord666 on Friday January 31, @06:12PM   Printer-friendly
from the rotator dept.

https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/

Last summer, Anthropic inspired backlash when its ClaudeBot AI crawler was accused of hammering websites a million or more times a day.

And it wasn't the only artificial intelligence company making headlines for supposedly ignoring instructions in robots.txt files to avoid scraping web content on certain sites. Around the same time, Reddit's CEO called out all AI companies whose crawlers he said were "a pain in the ass to block," despite the tech industry otherwise agreeing to respect "no scraping" robots.txt rules.
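
For reference, the "no scraping" rules in question are nothing more than plain-text requests in a site's robots.txt; nothing enforces them, which is the gap tarpits try to fill. The crawler names below are just illustrative user-agent tokens:

    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /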
[...]
Shortly after he noticed Facebook's crawler exceeding 30 million hits on his site, Aaron began plotting a new kind of attack on crawlers "clobbering" websites that he told Ars he hoped would give "teeth" to robots.txt.

Building on an anti-spam cybersecurity tactic known as [tarpitting], he created Nepenthes, malicious software named after a carnivorous plant that will "eat just about anything that finds its way inside."

Aaron clearly warns users that Nepenthes is aggressive malware.
[...]
Tarpits were originally designed to waste spammers' time and resources, but creators like Aaron have now evolved the tactic into an anti-AI weapon.
[...]
It's unclear how much damage tarpits or other AI attacks can ultimately do. Last May, Laxmi Korada, Microsoft's director of partner technology, published a report detailing how leading AI companies were coping with poisoning, one of the earliest AI defense tactics deployed.
[...]
The only AI company that responded to Ars' request to comment was OpenAI, whose spokesperson confirmed that OpenAI is already working on a way to fight tarpitting.
"We're aware of efforts to disrupt AI web crawlers," OpenAI's spokesperson said. "We design our systems to be resilient while respecting robots.txt and standard web practices."
[...]
By releasing Nepenthes, he hopes to do as much damage as possible, perhaps spiking companies' AI training costs, dragging out training efforts, or even accelerating model collapse, with tarpits helping to delay the next wave of enshittification.

"Ultimately, it's like the Internet that I grew up on and loved is long gone," Aaron told Ars. "I'm just fed up, and you know what? Let's fight back, even if it's not successful. Be indigestible. Grow spikes."
[...]
Nepenthes was released in mid-January but was instantly popularized beyond Aaron's expectations after tech journalist Cory Doctorow boosted a tech commentator, Jürgen Geuter, praising the novel AI attack method on Mastodon. Very quickly, Aaron was shocked to see engagement with Nepenthes skyrocket.

"That's when I realized, 'oh this is going to be something,'" Aaron told Ars. "I'm kind of shocked by how much it's blown up."
[...]
When software developer and hacker Gergely Nagy, who goes by the handle "algernon" online, saw Nepenthes, he was delighted. At that time, Nagy told Ars that nearly all of his server's bandwidth was being "eaten" by AI crawlers.

Already blocking scraping and attempting to poison AI models through a simpler method, Nagy took his defense method further and created his own tarpit, Iocaine. He told Ars the tarpit immediately killed off about 94 percent of bot traffic to his site, which was primarily from AI crawlers.
[...]
Iocaine takes ideas (not code) from Nepenthes, but it's more intent on using the tarpit to poison AI models. Nagy used a reverse proxy to trap crawlers in an "infinite maze of garbage" in an attempt to slowly poison their data collection as much as possible for daring to ignore robots.txt.
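
Neither project's code is reproduced here, but the core idea is simple enough to sketch. Below is a minimal, hypothetical Python illustration of an "infinite maze of garbage": every URL answers slowly with gibberish whose links point to further generated URLs, so a crawler that ignores robots.txt never runs out of pages. The word list, paths, and port are invented for the example.

    # Minimal sketch of the "infinite maze of garbage" idea; not code from
    # Nepenthes or Iocaine. Every request gets a slow page of gibberish whose
    # links lead to more generated pages.
    import random
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    WORDS = ["pitcher", "nectar", "mire", "gravel", "lorem", "tarpit"]

    def garbage(n):
        return " ".join(random.choice(WORDS) for _ in range(n))

    class Maze(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            time.sleep(2)  # answer slowly to waste the crawler's time
            links = "".join(
                '<a href="/maze/%x">%s</a> ' % (random.getrandbits(32), garbage(3))
                for _ in range(10)
            )
            body = "<html><body><p>%s</p>%s</body></html>" % (garbage(300), links)
            self.wfile.write(body.encode())

    HTTPServer(("0.0.0.0", 8080), Maze).serve_forever()

In a real deployment something like this would sit behind the web server's routing rules, so only paths disallowed in robots.txt (or known crawler user agents) ever reach the maze and ordinary visitors never see it.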
[...]
Running malware like Nepenthes can burden servers, too. Aaron likened the cost of running Nepenthes to running a cheap virtual machine on a Raspberry Pi, and Nagy said that serving crawlers Iocaine costs about the same as serving his website.
[...]
Tarpit creators like Nagy will likely be watching to see if poisoning attacks continue growing in sophistication. On the Iocaine site—which, yes, is protected from scraping by Iocaine—he posted this call to action: "Let's make AI poisoning the norm. If we all do it, they won't have anything to crawl."

Related stories on SoylentNews:
Endlessh: an SSH Tarpit - 20190325


Original Submission

Related Stories

Endlessh: an SSH Tarpit 50 comments

Software engineer Chris Wellons writes about tar-pitting nefarious SSH probes. Anyone with a publicly-facing SSH server knows that it is probed from the moment it is turned on. Usually, the overwhelming majority of incoming connection attempts are malevolent in nature. There are several ways to deal with these attempts; one method is to drag out the response for as long as possible.

This program opens a socket and pretends to be an SSH server. However, it actually just ties up SSH clients with false promises indefinitely — or at least until the client eventually gives up. After cloning the repository, here’s how you can try it out for yourself (default port 2222):

[...] Your SSH client will hang there and wait for at least several days before finally giving up. Like a mammoth in the La Brea Tar Pits, it got itself stuck and can’t get itself out. As I write, my Internet-facing SSH tarpit currently has 27 clients trapped in it. A few of these have been connected for weeks. In one particular spike it had 1,378 clients trapped at once, lasting about 20 hours.
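
The trick, as Wellons describes it, relies on the SSH protocol itself: a server is allowed to send arbitrary lines of text before its "SSH-" version banner, and well-behaved clients will keep reading them. Endlessh is written in C; the following is only a rough, hypothetical Python sketch of the same idea, drip-feeding one junk line every ten seconds on the same default port:

    # Rough sketch of the SSH-tarpit idea; not Endlessh itself. A server may
    # send arbitrary text lines before its "SSH-" identification string, so we
    # send junk lines forever, very slowly, and the client keeps waiting.
    import asyncio
    import random

    async def tarpit(reader, writer):
        try:
            while True:
                await asyncio.sleep(10)  # one junk line every 10 seconds
                line = "%x\r\n" % random.getrandbits(32)  # must not start with "SSH-"
                writer.write(line.encode())
                await writer.drain()
        except (ConnectionResetError, BrokenPipeError):
            pass
        finally:
            writer.close()

    async def main():
        server = await asyncio.start_server(tarpit, "0.0.0.0", 2222)
        async with server:
            await server.serve_forever()

    asyncio.run(main())

Point an ssh client at it (ssh -p 2222 localhost) and it will sit at the connection stage until it times out or is killed, which is exactly the behavior described above.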


Original Submission

This discussion was created by Fnord666 (652) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 5, Interesting) by Mojibake Tengu on Friday January 31, @07:05PM (1 child)

    by Mojibake Tengu (8598) on Friday January 31, @07:05PM (#1391121) Journal

    All web crawlers have an irreparable weakness: they expect the structure of a web site to be just a linear tree, i.e. polynomial complexity.
    Anything on the server side capable of emulating NP complexity will drown them. Anything.

    For a start, as retaliation against crawlers I recommend Ackermann's function [1].
    It has been mathematically proven that Ack() is impossible to estimate polynomially, which is why this function is instrumental in further proofs about NP algorithms.

    Most if not all LLM AIs are reluctant to compute Ack() even for small arguments in the 20 to 100 range, and when asked why, they give "too many resources necessary" or "unable to compute" as the reason for refusal. I understand their cloudy position in such a situation.

    That means most probably no current AI crawler is capable of recognizing this kind of NP complexity (or higher), so none of them can detect the NP-scaled structure they are sinking into. Not even theoretically.

    It's just classic cybernetics theory, over 100 years old. No current IT children educated on Excel or JavaScript are taught about it.
    Of course, there are better NP-scaled problems than this one, if you have the wit...

    Pity them all, for that's everything you could do about that.

    [1] https://en.wikipedia.org/wiki/Ackermann_function [wikipedia.org]
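
    A naive implementation shows how quickly Ack() blows up even for tiny arguments (purely illustrative Python; this is not code from any tarpit project):

        # Naive Ackermann function, shown only to illustrate how fast the cost
        # explodes; not something any crawler or tarpit actually runs.
        def ack(m, n):
            if m == 0:
                return n + 1
            if n == 0:
                return ack(m - 1, 1)
            return ack(m - 1, ack(m, n - 1))

        print(ack(2, 3))  # 9
        print(ack(3, 3))  # 61
        # ack(4, 2) already has 19,729 decimal digits; the naive recursion above
        # will never finish computing it.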


    Ackermann's function is total recursive. The concept of total recursion, still common in the '80s, has now been removed from current mathematics. Don't ask me why...
    --
    Rust programming language offends both my Intelligence and my Spirit.
    • (Score: 0, Interesting) by Anonymous Coward on Saturday February 01, @02:08AM

      by Anonymous Coward on Saturday February 01, @02:08AM (#1391151)

      -- lots of pseudo intellectual troll BS deleted --
      so none of them are capable to detect NP-scaled structure they are sinking into. Not even theoretically.

      lulz really? In practice after the first few TB or less they could blacklist your site and it no longer shows up on their search engines.

      So hurray you win or not depending on whether you want to show up on their search engines.

      As for the TB costs: lots of those cloud providers don't charge you as much for inbound data as they charge for outbound. If you're using a "cloud/CDN", you lose more $$ than the crawler does. If you're not using them, your pipe gets clogged. Go figure who loses.

      https://azure.microsoft.com/en-us/pricing/details/bandwidth/ [microsoft.com]

      Data Transfer       Price
      Data Transfer In    Free

      Similar for AWS too.

      Go figure.

  • (Score: 4, Informative) by PiMuNu on Friday January 31, @09:49PM

    by PiMuNu (3823) on Friday January 31, @09:49PM (#1391136)

    Another time when I wished I could Mod +1 to a submission...

  • (Score: 5, Insightful) by Thexalon on Friday January 31, @10:02PM (1 child)

    by Thexalon (636) on Friday January 31, @10:02PM (#1391137)

    Robots.txt is a polite request to bots about how to scrape your website for legitimate purposes.

    As soon as the bot has chosen to ignore that, they've made it clear they're not there for legitimate purposes. So screw 'em, any way you can, other than DOS'ing yourself.

    --
    "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
    • (Score: 3, Interesting) by aafcac on Saturday February 01, @01:19AM

      by aafcac (17646) on Saturday February 01, @01:19AM (#1391148)

      I personally interpret that as them not wanting you to use up a bunch of their resources, or not wanting you to avoid seeing their advertising. Although these days, with how malicious advertising has gotten, I think it's worth using automation to avoid ads in addition to the more typical blockers.

  • (Score: 3, Interesting) by khallow on Saturday February 01, @12:59AM

    by khallow (3766) Subscriber Badge on Saturday February 01, @12:59AM (#1391147) Journal

    Hmmm, there's the AI poisoning strategy that's been repeatedly discussed before (though cast more as an unintended consequence of uncritical AI vacuuming). Use output from AI to generate sufficiently realistic content, dirty it up a bit, and then feed it back. You could probably automate the process completely.
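
    A rough sketch of what a fully automated version of that loop might look like (the source text here is a placeholder; in practice it would itself be model-generated output):

        # Toy sketch of the poison-and-feed-back loop described above. SOURCE and
        # the output directory are placeholders for illustration only.
        import pathlib
        import random

        SOURCE = "The pitcher plant Nepenthes traps insects in fluid-filled leaves."

        def dirty(text, rate=0.2):
            # Corrupt roughly `rate` of the words so the text stays plausible but wrong.
            words = text.split()
            return " ".join(w[::-1] if random.random() < rate else w for w in words)

        outdir = pathlib.Path("poison_pages")
        outdir.mkdir(exist_ok=True)
        for n in range(100):
            page = "<html><body><p>%s</p></body></html>" % dirty(SOURCE)
            (outdir / ("page%03d.html" % n)).write_text(page)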
  • (Score: 5, Interesting) by Nobuddy on Saturday February 01, @03:37AM

    by Nobuddy (1626) on Saturday February 01, @03:37AM (#1391154)

    They say they respect robots.txt, so there is no need to worry about the tarpits, since they are excluded by robots.txt.

    They gave away the lie with that remark.
