
SoylentNews is people

posted by Fnord666 on Thursday April 05 2018, @08:27PM
from the digital-fingerprints dept.

Zero-width characters are invisible, ‘non-printing’ characters that are not displayed by the majority of applications. F​or exam​ple, I’ve ins​erted 10 ze​ro-width spa​ces in​to thi​s sentence, c​an you tel​​l? (Hint: paste the sentence into Diff Checker to see the locations of the characters!). These characters can be used to ‘fingerprint’ text for certain users.

Well, the original reason isn’t too exciting. A few years ago I was a member of a team that participated in competitive tournaments across a variety of video games. This team had a private message board, used to post important announcements amongst other things. Eventually these announcements would appear elsewhere on the web, posted to mock the team and, more significantly, rendering the message board redundant for sharing confidential information and tactics.

The security of the site seemed pretty tight, so the theory was that a logged-in user was simply copying the announcement and posting it elsewhere. I created a script that allowed the team to invisibly fingerprint each announcement with the username of the user it was being displayed to.

I saw a lot of interest in zero-width characters from a recent post by Zach Aysan so I thought I’d publish this method here along with an interactive demo to share with everyone. The code examples have been updated to use modern JavaScript but the overall logic is the same.
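
The summary above does not reproduce the article's code, so what follows is only a minimal sketch of the idea, written in Python rather than the article's JavaScript. It assumes one possible scheme: each bit of the UTF-8 username is written as one of two zero-width characters and the result is hidden after the first word of the text. The choice of U+200B and U+200C and all function names are illustrative assumptions, not the author's implementation.

```python
# Assumption: the original text contains neither of these two characters.
ZERO = "\u200b"  # ZERO WIDTH SPACE, used here for bit 0
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, used here for bit 1


def encode_username(username: str) -> str:
    """Turn a username into an invisible run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in username.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def fingerprint(text: str, username: str) -> str:
    """Hide the encoded username directly after the first word of the text."""
    head, sep, tail = text.partition(" ")
    return head + encode_username(username) + sep + tail


def extract_username(text: str) -> str:
    """Recover the username from a copy by reading only the marker characters."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def strip_zero_width(text: str) -> str:
    """Remove the marker characters, which is the obvious countermeasure."""
    return text.replace(ZERO, "").replace(ONE, "")
```

Under these assumptions, `extract_username(fingerprint("Scrim tonight at 8pm", "alice"))` gives back `alice`, while `strip_zero_width` restores the visible text unchanged. That is why a leaker who knows about the trick can defeat it easily.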



 
  • (Score: 2) by pipedwho on Friday April 06 2018, @09:02PM (2 children)

    by pipedwho (2032) on Friday April 06 2018, @09:02PM (#663547)

    Doesn’t get rid of phrase and word substitutions, which can be done manually (or to some degree automatically) on an A/B basis throughout the document. Then the system that allows access and stores both copies (or diff streams) finds the diff sections and gives you an A or B to encode a bit. The only way around this is to get two or more versions of the document and diff them yourself to remodulate those elements. Even retyping the document ‘in your own words’ may leak some inserted encoding, such as a modulated ‘fact’ like a percentage changed below its error bounds (e.g. 49.5 changed to 48.6), or a false but unimportant name/fact added to a list.
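
As a rough illustration of the A/B scheme described in the comment above, the following Python sketch treats each substitution point as one bit: variant A stands for 0 and variant B for 1. The template sentence, the word pairs and the function names are all made up for the example. A real system would need substitutions that read naturally in context.

```python
import re

# Hypothetical substitution points: (variant A = bit 0, variant B = bit 1).
VARIANTS = [
    ("begin", "start"),
    ("large", "big"),
    ("quickly", "rapidly"),
]

# Hypothetical document with one slot per substitution point.
TEMPLATE = "We {0} at noon with a {1} roster and rotate {2}."


def render(copy_id: int) -> str:
    """Produce one reader's copy, picking A or B per slot from the bits of copy_id."""
    words = [pair[(copy_id >> slot) & 1] for slot, pair in enumerate(VARIANTS)]
    return TEMPLATE.format(*words)


def identify(leaked: str) -> int:
    """Recover copy_id from a leaked copy by checking which variant each slot uses."""
    words = set(re.findall(r"[a-z]+", leaked.lower()))
    copy_id = 0
    for slot, (_variant_a, variant_b) in enumerate(VARIANTS):
        if variant_b in words:
            copy_id |= 1 << slot
    return copy_id
```

With three slots this distinguishes eight copies, and `identify(render(5))` returns 5. Stripping invisible characters does nothing here, because the mark is in the visible wording. Diffing copies 0 and 5 exposes only the two slots where they differ, which is why the comment suggests collecting two or more versions.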

  • (Score: 2) by Osamabobama on Friday April 06 2018, @10:43PM (1 child)

    by Osamabobama (5842) on Friday April 06 2018, @10:43PM (#663571)

    Some possible encoding can be stripped off with a round trip through a series of translators. That would leave numbers and names the same, though, presumably. If you have to try too hard to remove the data that may implicate you as the leaker, it may not be possible to leak useful information without becoming known.

    The logical end point of maximum actual risk with minimum detectable risk is where you are the only one with the document, but there is nothing encoded in the text. If you are going to play cat-and-mouse games, it's better to be the cat...

    --
    Appended to the end of comments you post. Max: 120 chars.
    • (Score: 2) by pipedwho on Saturday April 07 2018, @12:17AM

      by pipedwho (2032) on Saturday April 07 2018, @12:17AM (#663593)

      Or obtain the leaked info through someone else’s account, or a side channel without identifiable access. But, you’re right, it’s always better to be the cat.