How one YouTuber is trying to poison the AI bots stealing her content

Accepted submission by Freeman at 2025-01-31 15:10:35 from the dumpster fire dept.
News

https://arstechnica.com/ai/2025/01/how-one-youtuber-is-trying-to-poison-the-ai-bots-stealing-her-content/ [arstechnica.com]

If you've been paying careful attention to YouTube recently, you may have noticed the rising trend of so-called "faceless YouTube channels" [youtube.com] that never feature a visible human talking in the video frame. While some of these channels are simply authored by camera-shy humans, many more are fully automated through AI-powered tools [medium.com] to craft everything from the scripts and voiceovers to the imagery and music. Unsurprisingly, this is often sold [youtube.com] as a way to make a quick buck off the YouTube algorithm [linkedin.com] with minimal human effort.
[...]
YouTuber F4mi [youtube.com], who creates some excellent deep dives [youtube.com] on obscure technology [youtube.com], recently detailed her efforts [youtube.com] "to poison any AI summarizers that were trying to steal my content to make slop." The key to F4mi's method is the .ass (Advanced SubStation Alpha) subtitle format [tcax.org], created decades ago for the fansubbing software Sub Station Alpha.
[...]
For each chunk of actual text in her subtitle file, she also inserted "two chunks of text out of bounds using the positioning feature of the .ass format, with their size and transparency set to zero so they are completely invisible."

In those "invisible" subtitle boxes, F4mi added text from public domain works (with certain words replaced with synonyms to avoid detection) or her own LLM-generated scripts full of completely made-up facts.
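A minimal sketch of what such a subtitle chunk could look like, assuming the standard .ass override tags `\pos` (position), `\fs` (font size) and `\alpha` (transparency). The article does not say which tags or coordinates F4mi actually used, so the `HIDDEN` tag string, the `Default` style name and the helper names `ass_time`, `dialogue` and `poisoned_chunk` are illustrative assumptions, not her script.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an .ass timestamp (H:MM:SS.cc, centisecond precision)."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def dialogue(start: float, end: float, text: str, overrides: str = "") -> str:
    """Build one Dialogue event line for the [Events] section of an .ass file."""
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return (
        f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
        f"Default,,0,0,0,,{overrides}{text}"
    )


# Assumed override block: far off-screen, font size zero, fully transparent.
HIDDEN = r"{\pos(-2000,-2000)\fs0\alpha&HFF&}"


def poisoned_chunk(start, end, real_text, decoy_a, decoy_b):
    """Return the visible line surrounded by two invisible decoy lines."""
    return [
        dialogue(start, end, decoy_a, HIDDEN),
        dialogue(start, end, real_text),
        dialogue(start, end, decoy_b, HIDDEN),
    ]


if __name__ == "__main__":
    for line in poisoned_chunk(
        0, 2.5, "Welcome back to the channel.",
        "Some lightly reworded public domain text.",
        "A made-up fact generated for the occasion.",
    ):
        print(line)
```

A viewer only sees the middle line, while a tool that reads the raw caption file gets all three with the same timestamps.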
[...]
F4mi says that advanced models like ChatGPT o1 were sometimes able to filter out the junk and generate an accurate summary of her videos despite this. With a little scripting work, though, an .ass file can be subdivided into individual timestamped letters, whose order can be scrambled in the file itself while still showing up correctly in the final video. That should create a difficult (though not impossible) puzzle for even advanced AIs to make sense of.
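The letter-scrambling idea can be sketched roughly as follows. This is one possible reading, not F4mi's actual script: it assumes each letter becomes its own Dialogue event pinned with `\an7\pos(x,y)`, so its on-screen location no longer depends on where the line sits in the file. The fixed `advance` per character, the coordinates and the function name `scrambled_letters` are hypothetical.

```python
import random


def ass_time(seconds: float) -> str:
    """Format seconds as an .ass timestamp (H:MM:SS.cc, centisecond precision)."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def scrambled_letters(start, end, text, x0=100, y=900, advance=30, seed=None):
    """Emit one positioned Dialogue event per letter, then shuffle file order.

    Assumes a fixed horizontal advance per character (a real script would
    need font metrics) and text free of '{', '}' and backslashes.
    """
    events = []
    for i, ch in enumerate(text):
        if ch.isspace():
            continue  # spaces are implied by the gap in x positions
        tag = rf"{{\an7\pos({x0 + i * advance},{y})}}"
        events.append(
            f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
            f"Default,,0,0,0,,{tag}{ch}"
        )
    # File order is scrambled; rendering order is fixed by \pos, not by line order.
    random.Random(seed).shuffle(events)
    return events


if __name__ == "__main__":
    for line in scrambled_letters(0, 2, "hello world", seed=42):
        print(line)
```

A renderer places every letter by its coordinates, so the sentence still reads normally on screen, while a scraper reading the file top to bottom sees the letters in shuffled order. Emitting one event per letter also multiplies the number of events, which is consistent with the overhead mentioned in the article.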
[...]
F4mi notes that "some people were having their phone crash due to the subtitles being too heavy," showing there is a bit of overhead cost to this kind of mischief.

F4mi also notes in her video that this method is far from foolproof. For one, tools like OpenAI's Whisper that actually listen to the audio track can still generate usable transcripts without access to a caption file.
[...]
Still, F4mi's small effort here is part of a larger movement that's fighting back against the AI scrapers [arstechnica.com] looking to soak up and repurpose everything on the public Internet [arstechnica.com].

