If you've been paying careful attention to YouTube recently, you may have noticed the rising trend of so-called "faceless YouTube channels" that never feature a visible human talking in the video frame. While some of these channels are simply authored by camera-shy humans, many more are fully automated through AI-powered tools to craft everything from the scripts and voiceovers to the imagery and music. Unsurprisingly, this is often sold as a way to make a quick buck off the YouTube algorithm with minimal human effort.
[...]
YouTuber F4mi, who creates some excellent deep dives on obscure technology, recently detailed her efforts "to poison any AI summarizers that were trying to steal my content to make slop." The key to F4mi's method is the .ass subtitle format, created decades ago as part of fansubbing software Advanced SubStation Alpha.
[...]
For each chunk of actual text in her subtitle file, she also inserted "two chunks of text out of bounds using the positioning feature of the .ass format, with their size and transparency set to zero so they are completely invisible." In those "invisible" subtitle boxes, F4mi added text from public domain works (with certain words replaced with synonyms to avoid detection) or her own LLM-generated scripts full of completely made-up facts.
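As a rough illustration of the trick described above (not F4mi's actual script, which the article does not show), the sketch below emits one real subtitle event surrounded by decoy events. It assumes the standard ASS `Dialogue:` field order and the override tags `\pos`, `\fs` and `\alpha`; the helper names, the off-screen coordinates and the "Default" style name are made up for the example.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as the H:MM:SS.cc timestamp used in ASS events."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def dialogue(start: float, end: float, text: str, style: str = "Default") -> str:
    # Field order: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return f"Dialogue: 0,{ass_time(start)},{ass_time(end)},{style},,0,0,0,,{text}"


# Override block: \pos moves the event off screen, \fs0 sets the font size
# to zero, and \alpha&HFF& makes it fully transparent.
# The coordinates are an arbitrary assumption for this sketch.
HIDDEN = r"{\pos(-500,-500)\fs0\alpha&HFF&}"


def with_decoys(start: float, end: float, real_text: str, decoys: list[str]) -> list[str]:
    """Return the real event with the hidden decoy events placed around it."""
    lines = [dialogue(start, end, HIDDEN + d) for d in decoys]
    lines.insert(len(lines) // 2, dialogue(start, end, real_text))
    return lines


if __name__ == "__main__":
    for line in with_decoys(1.0, 4.0, "Hello", ["junk a", "junk b"]):
        print(line)
```

A viewer only ever sees "Hello", while a tool that reads the raw text fields gets all three lines.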
[...]
F4mi says that advanced models like ChatGPT o1 were sometimes able to filter out the junk and generate an accurate summary of her videos despite this. With a little scripting work, though, an .ass file can be subdivided into individual timestamped letters, whose order can be scrambled in the file itself while still showing up correctly in the final video. That should create a difficult (though not impossible) puzzle for even advanced AIs to make sense of.
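The article does not show the script, so the following is only a guess at how the letter-scrambling step could work. Each visible character becomes its own event pinned to a screen position with `\pos`, and the events are then shuffled in the file. The fixed horizontal step assumes a monospace style, and the coordinates, step size and function names are invented for the example.

```python
import random
import re


def letter_events(text, start, end, x0=100, y=650, step=24, seed=0):
    """Build one Dialogue event per visible character, shuffled in file order.

    Assumes a monospace font so a fixed horizontal step lines the letters up,
    and that `text` contains no ASS special characters such as '{' or a backslash.
    """
    events = []
    for i, ch in enumerate(text):
        if ch.isspace():
            continue  # gaps come from the positions, not from events
        events.append(
            f"Dialogue: 0,{start},{end},Default,,0,0,0,,{{\\pos({x0 + i * step},{y})}}{ch}"
        )
    random.Random(seed).shuffle(events)  # file order no longer matches reading order
    return events


POS_AND_CHAR = re.compile(r"\{\\pos\((\d+),(\d+)\)\}(.)$")


def visible_text(events):
    """Recover what a viewer would read by sorting letters left to right."""
    placed = []
    for event in events:
        m = POS_AND_CHAR.search(event)
        placed.append((int(m.group(1)), m.group(3)))
    return "".join(ch for _, ch in sorted(placed))


if __name__ == "__main__":
    events = letter_events("Hi there", "0:00:01.00", "0:00:04.00")
    print("\n".join(events))
    print(visible_text(events))
```

A scraper that concatenates the text fields in file order gets a jumble, while anything that honors the positions can still read the line. That matches the "difficult (though not impossible)" caveat above.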
[...]
F4mi notes that "some people were having their phone crash due to the subtitles being too heavy," showing there is a bit of overhead cost to this kind of mischief. F4mi also notes in her video that this method is far from foolproof. For one, tools like OpenAI's Whisper that actually listen to the audio track can still generate usable transcripts without access to a caption file.
[...]
Still, F4mi's small effort here is part of a larger movement that's fighting back against the AI scrapers looking to soak up and repurpose everything on the public Internet.
(Score: 4, Interesting) by Mojibake Tengu on Monday February 03, @06:03AM (8 children)
Girl's clever, I admit.
Data poisoning as a protection against AIs' ability to process it adequately is the next big thing soon. Very soon.
Broken file formats, extreme utilization of traditional but incomplete specifications, nonlinear structuring, unbound arithmetic, hyperoperations, time traps.
People will reinvent lost forgotten stuff. Expect troubles ahead.
Welcome to the Desert of the Real.
Rust programming language offends both my Intelligence and my Spirit.
(Score: 4, Touché) by c0lo on Monday February 03, @07:09AM
Not to worry, we can always easily fall back on human-mediated pure (as in "not even transformative") theft: Disney stole my artwork and sold it in their parks - Update after 2 1/2 years [youtube.com]
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
(Score: 3, Interesting) by c0lo on Monday February 03, @07:42AM (6 children)
For this one, it only takes the implementation of a proper .ass format "interpreter" to feed AI with the same thing that humans receive in their input.
It's a hack that exploits a vulnerability - plug the latter and make the first one irrelevant - probably a couple manmonths as the worst cost
In the general case, I can't imagine a failsafe mechanism that poisons the input of an AI but is harmless for humans.
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
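To make the "interpreter" idea in the comment above concrete, here is a minimal sketch of a filter that drops events a viewer could not see before the text goes to a summarizer. It covers only the tricks named in the article (zero font size, full transparency, off-screen `\pos`). The 1920x1080 canvas and all function names are assumptions, and it ignores style definitions and per-letter scrambling, which a real renderer such as libass would handle.

```python
import re

TAGS = re.compile(r"\{[^}]*\}")                      # override blocks like {\pos(1,2)}
FS_ZERO = re.compile(r"\\fs0(?![\d.])")              # \fs0 but not \fs05 or \fs0.5
ALPHA_FULL = re.compile(r"\\(?:alpha|1a)&HFF&?", re.IGNORECASE)
POS = re.compile(r"\\pos\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)")


def is_visible(text, width=1920, height=1080):
    """Heuristic: False if the override tags hide the event from a viewer."""
    tags = "".join(TAGS.findall(text))
    if FS_ZERO.search(tags) or ALPHA_FULL.search(tags):
        return False
    m = POS.search(tags)
    if m:
        x, y = float(m.group(1)), float(m.group(2))
        if not (0 <= x <= width and 0 <= y <= height):
            return False
    return True


def visible_lines(ass_text, width=1920, height=1080):
    """Return the tag-stripped text of every event that looks visible."""
    out = []
    for line in ass_text.splitlines():
        if not line.startswith("Dialogue:"):
            continue
        parts = line.split(",", 9)  # the Text field is last and may contain commas
        if len(parts) < 10:
            continue
        if is_visible(parts[9], width, height):
            out.append(TAGS.sub("", parts[9]))
    return out
```

Fed a file with one real line and a hidden decoy, `visible_lines` returns only the real line. Whether this really is "a couple manmonths" of work depends on how many other hiding tricks the format allows.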
(Score: 3, Funny) by Dr Spin on Monday February 03, @08:08AM (5 children)
Some of my websites have pages which generate endless simulations of a crash with a core-dump.
I originally did this to handle Russian hackers, but it should work quite well on robots too.
You only need one instance, as you can put links all over the place with names that are obviously not relevant to
the content intended for people who wanted the real content.
Warning: Opening your mouth may invalidate your brain!
(Score: 0) by Anonymous Coward on Monday February 03, @09:44AM (4 children)
You should also have a file somewhere called Password.txt that is specifically excluded by Robots.txt
(Score: 2) by c0lo on Monday February 03, @11:29AM (3 children)
Lemme guess, I should also include it into a XML Google sitemap [google.com] and link to it from the site's index.htm page (and, of course, rant that nobody should use the .html extension).
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
(Score: 2) by dwilson on Monday February 03, @06:37PM (2 children)
Care to elaborate on why? As I understood it, the three-letter limit on file extensions is a holdover from the bad old days of FAT, MS-DOS, and Windows 95. With modern filesystems and operating systems, why does it really matter if I use a four-letter extension? If the relevant spec supports either, then I see no issue.
- D
(Score: 3, Interesting) by Dr Spin on Monday February 03, @07:06PM
the three-letter limit on file extensions is a holdover from the DEC PDP8 -> PDP11 transition:
RSX11 Used three 5-bit characters packed into a 16-bit word, with one remaining bit to say whether they were
all upper case or all lower case.
If I remember right (after 65 years) the upper case version was later used for directory names with lower case
for file names - but that might not have been an actual DEC standard.
Warning: Opening your mouth may invalidate your brain!
(Score: 2) by c0lo on Monday February 03, @08:14PM
Because that's how FrontPage [microsoft.com] saves them, no doubt as God intended.
(did I really need to use /s in my comment to which you replied?)
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
(Score: 3, Interesting) by Username on Monday February 03, @03:44PM
Have two different subtitle lines of nonsense in a monospace font; when they're layered on top of each other, they form a normal sentence.
T e u c B o n o
h q i k r w F x
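The two-layer idea above can be sketched in a few lines. This is only an illustration of the split, assuming a monospace font so the columns line up; the function names are made up, and actually stacking the two lines on screen would still need subtitle positioning.

```python
def split_layers(text):
    """Split text into two lines that each hold every other character."""
    top = "".join(ch if i % 2 == 0 else " " for i, ch in enumerate(text))
    bottom = "".join(ch if i % 2 == 1 else " " for i, ch in enumerate(text))
    return top, bottom


def overlay(top, bottom):
    """Simulate stacking the two lines: a blank column shows the other layer."""
    return "".join(b if a == " " else a for a, b in zip(top, bottom))


if __name__ == "__main__":
    top, bottom = split_layers("ThequickBrownFox")
    print(top)
    print(bottom)
    print(overlay(top, bottom))
```

With "ThequickBrownFox" as input, this reproduces the two lines in the comment, except that the second line keeps its leading space so the columns stay aligned.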
(Score: 2) by jman on Tuesday February 04, @04:57PM
The "quiz" was a nice ploy to get you to subscribe to their services.
They show you eight text messages and ask you to decide if it's a scam or not.
Most were obvious, but one in particular - number 7 - they got wrong. It was purportedly from a bank, asking you to either reply or click a link to log in.
Supposedly this one was not a scam, but you couldn't hover over the link to see where it really went, so potentially it could have been a scam.