SoylentNews Comments | Be Careful What You Copy: Invisibly Inserting Usernames Into Text

Be Careful What You Copy: Invisibly Inserting Usernames Into Text

posted by Fnord666 on Thursday April 05 2018, @08:27PM

from the digital-fingerprints dept.

Arthur T Knackerbracket has found the following story:

Zero-width characters are invisible, ‘non-printing’ characters that are not displayed by the majority of applications. For example, I’ve inserted 10 zero-width spaces into this sentence, can you tell? (Hint: paste the sentence into Diff Checker to see the locations of the characters!). These characters can be used to ‘fingerprint’ text for certain users.
Well, the original reason isn’t too exciting. A few years ago I was a member of a team that participated in competitive tournaments across a variety of video games. This team had a private message board, used to post important announcements amongst other things. Eventually these announcements would appear elsewhere on the web, posted to mock the team and, more significantly, making the message board useless for sharing confidential information and tactics.
The security of the site seemed pretty tight so the theory was that a logged-in user was simply copying the announcement and posting it elsewhere. I created a script that allowed the team to invisibly fingerprint each announcement with the username of the user it was being displayed to.
I saw a lot of interest in zero-width characters from a recent post by Zach Aysan so I thought I’d publish this method here along with an interactive demo to share with everyone. The code examples have been updated to use modern JavaScript but the overall logic is the same.
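
A minimal sketch of the fingerprinting idea described in the story, written in Python rather than the article's JavaScript. The bit-to-character mapping, the function names `encode_fingerprint`, `embed`, and `extract`, the sample username, and the choice to place the fingerprint after the first character are all assumptions made for illustration; the summary above does not say how the author's script does it.

```python
# Hypothetical mapping: the summary does not say which character stands for which bit.
ZERO = "\u200b"  # ZERO WIDTH SPACE stands for bit 0 (assumption)
ONE = "\u200c"   # ZERO WIDTH NON-JOINER stands for bit 1 (assumption)


def encode_fingerprint(username: str) -> str:
    """Turn a username into an invisible string of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in username.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def embed(text: str, username: str) -> str:
    """Insert the invisible fingerprint after the first character of the text."""
    if not text:
        return text
    return text[0] + encode_fingerprint(username) + text[1:]


def extract(text: str) -> str:
    """Recover the username from the zero-width characters found in the text."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


announcement = "Practice is moved to Friday."
marked = embed(announcement, "alice")  # "alice" is a made-up username

print(len(announcement), len(marked))  # the marked copy is longer but looks the same
print(extract(marked))
```

A copy pasted elsewhere keeps the invisible characters unless the destination strips them, which is what lets `extract` name the account the text was shown to.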


This discussion has been archived. No new comments can be posted.


The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Not showing my username (Score: 2) by NewNic on Thursday April 05 2018, @08:39PM (14 children)

by NewNic (6420) on Thursday April 05 2018, @08:39PM (#663093) Journal

I tried using the webpage https://umpox.github.io/zero-width-detection/ [github.io] and it did not show my username. When I pasted it into Diff Checker, there were no zero width characters.
I tried using Firefox and Chrome under Linux.

--
lib·er·tar·i·an·ism ˌlibərˈterēənizəm/ noun: Magical thinking that useful idiots mistake for serious political theory

Re:Not showing my username (Score: 1, Funny) by Anonymous Coward on Thursday April 05 2018, @08:51PM

by Anonymous Coward on Thursday April 05 2018, @08:51PM (#663100)

You need to be wearing your captain crunch secret decoder ring.

Re:Not showing my username (Score: 0) by Anonymous Coward on Thursday April 05 2018, @09:35PM

by Anonymous Coward on Thursday April 05 2018, @09:35PM (#663116)

Worked for me pasting into hexdump -C both in MSYS2/Windows and GNU/Linux.

Re:Not showing my username (Score: 1) by speederaser on Thursday April 05 2018, @11:06PM (4 children)

by speederaser (4049) on Thursday April 05 2018, @11:06PM (#663168)

In Firefox and PaleMoon you can see them by right-clicking -> view page source. The inserted string shows as: "& # 8 2 0 3 ;" without the double quotes or the actual spaces inserted.

- Re:Not showing my username (Score: 2) by maxwell demon on Friday April 06 2018, @06:41AM (3 children)
  
  by maxwell demon (1608) on Friday April 06 2018, @06:41AM (#663301) Journal
  
  The inserted string shows as: "& # 8 2 0 3 ;" without the double quotes or the actual spaces inserted.
  So in other words, it shows as “&#8203;”; why not just write that?
  But then, this is not because of Firefox/Palemoon, but because that's what the site delivers to the browser. If the site had decided to deliver actual Unicode characters instead of HTML entities, then your browser would not show entities in the source.
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
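
The distinction drawn above, between an HTML entity in the page source and the actual Unicode character it stands for, can be checked with Python's standard `html` module. This is only an illustration of the entity itself; it makes no claim about how the demo site builds its pages.

```python
import html

entity = "&#8203;"            # the text visible in "view page source"
char = html.unescape(entity)  # the character a browser produces from it

print(repr(char))                    # the character, shown as an escape
print(f"U+{ord(char):04X}")          # its code point
print(char.encode("utf-8"))          # its three UTF-8 bytes
print(f"&#{ord(char)};" == entity)   # 8203 decimal is the same code point
```

8203 in decimal is 200B in hexadecimal, so the entity and the `342 200 213` (octal) or `E2 80 8B` (hex) byte sequences seen elsewhere in this thread all describe the same zero-width space.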
  
  - Re:Not showing my username (Score: 1) by speederaser on Friday April 06 2018, @04:10PM (2 children)
    
    by speederaser (4049) on Friday April 06 2018, @04:10PM (#663466)
    
    So in other words, it shows as “&#8203;”; why not just write that?
    Because "preview" made it invisible when I did it that way, even when I used "Plain Old Text". Just like it appears in preview on this post.
    
    - Re:Not showing my username (Score: 2) by maxwell demon on Friday April 06 2018, @04:53PM
      
      by maxwell demon (1608) on Friday April 06 2018, @04:53PM (#663482) Journal
      
      Hint: &amp;
      
      --
      The Tao of math: The numbers you can count are not the real numbers.
      
    - Re:Not showing my username (Score: 2) by Osamabobama on Friday April 06 2018, @09:55PM
      
      by Osamabobama (5842) on Friday April 06 2018, @09:55PM (#663555)
      
      My favorite is when the html sorcery is correct, so preview looks fine, but the text-entry window version of the comment also gets changed, so hitting Submit (or Preview, again) will post something else.
      For instance, if you want to show <i>html tags</i> in the post, you format your comment with escape characters so the correct tag shows up in the preview. However, the comment text also strips out the escape characters, so pressing Submit will then post the comment with the tags interpreted, rather than displayed.
      Each cycle is slightly different:
      
      &lt;i&gt;html tags&lt;/i&gt;
      <i>html tags</i>
      html tags
      
      --
      Appended to the end of comments you post. Max: 120 chars.
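
The three-stage cycle described in the comment above can be imitated in Python. This is a sketch of the effect only: the assumption is that each preview decodes the entities once and writes the decoded text back into the edit box. The tag stripping with `re.sub` merely stands in for a browser rendering the italic tags; none of this is taken from the site's actual code.

```python
import html
import re

# What the poster types so that the preview shows a literal tag (assumed input).
typed = "&lt;i&gt;html tags&lt;/i&gt;"

# After one preview: the entities are decoded, and the edit box now holds this text.
after_preview = html.unescape(typed)

# After submitting the rewritten text: the tags are real markup, so they are
# interpreted rather than displayed. Stripping them stands in for rendering.
after_submit = re.sub(r"</?i>", "", after_preview)

for stage in (typed, after_preview, after_submit):
    print(stage)
```

Each pass removes one layer of escaping, which is why the escaping has to be doubled if the text is going to pass through preview before being submitted.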
      
Re:Not showing my username (Score: 3, Interesting) by FatPhil on Friday April 06 2018, @06:29AM (5 children)

by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Friday April 06 2018, @06:29AM (#663295) Homepage

Well, firstly, that diffchecker webpage is retarded - it did absolutely nothing when I enabled JS for the not-obviously-spammy domains, and is clearly the wrong tool for the job. Anyone who thinks that the best way of analysing a stream of data for embedded invisible characters is by diffing it with something is using a hammer on a screw. The obvious tool for the job is of course od(1).

$ echo -n 'For example, I’ve inserted 10 zero-width spaces into this sentence, can you tell?' | od -c
0000000   F 342 200 213   o   r       e   x   a   m 342 200 213   p   l
0000020   e   ,       I 342 200 231   v   e       i   n   s 342 200 213
0000040   e   r   t   e   d       1   0       z   e 342 200 213   r   o
0000060   -   w   i   d   t   h       s   p   a 342 200 213   c   e   s
0000100       i   n 342 200 213   t   o       t   h   i 342 200 213   s
0000120       s   e   n   t   e   n   c   e   ,       c 342 200 213   a
0000140   n       y   o   u       t   e   l 342 200 213 342 200 213   l
0000160   ?
0000161

Or if you just want to count them:

$ echo -n 'For example, I’ve inserted 10 zero-width spaces into this sentence, can you tell?' | tr -d '[[:print:]]' | wc -c
33

All of which means the author can't count.

However, you're misunderstanding him: he never claimed to have embedded your username in that sentence, only to have embedded some non-displaying characters.

--
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
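
For readers without od(1) to hand, the same inspection can be sketched in Python. The helper name `find_invisible` and the decision to flag everything in Unicode general category `Cf` (format characters) are assumptions for illustration; that category includes U+200B but is broader than zero-width spaces alone.

```python
import unicodedata


def find_invisible(text: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, name) for every format-category character."""
    hits = []
    for index, ch in enumerate(text):
        # Category "Cf" covers U+200B, U+200C, U+200D, U+FEFF and similar characters.
        if unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


# The first two words of the sentence, with the zero-width spaces written as escapes.
sample = "F\u200bor exam\u200bple"
for hit in find_invisible(sample):
    print(hit)
```

Unlike a byte dump, this works on decoded characters, so each zero-width space is reported once rather than as three bytes.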

- Re:Not showing my username (Score: 3, Interesting) by maxwell demon on Friday April 06 2018, @06:55AM (1 child)
  
  by maxwell demon (1608) on Friday April 06 2018, @06:55AM (#663309) Journal
  
  Actually, you can just look at it with less. Then you even get the Unicode code numbers in a readable form:
  F<U+200B>or exam<U+200B>ple, I’ve ins<U+200B>erted 10 ze<U+200B>ro-width spa<U+200B>ces in<U+200B>to thi<U+200B>s sentence, c<U+200B>an you tel<U+200B><U+200B>l?
  (Note that the Unicode code points are shown inverted, so you can distinguish them from an ASCII character sequence of the same form).
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
  
  - Re:Not showing my username (Score: 3, Interesting) by FatPhil on Friday April 06 2018, @08:00AM
    
    by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Friday April 06 2018, @08:00AM (#663328) Homepage
    
    Good point. TMTOWTDI is good. But is this the Unix way? Personally, I don't believe that's less's job, it should be a pager with scrollback, and very little more - I don't even see a switch to turn it off, unless that's what -r is for, and in that case, it's terribly documented (non-ASCII utf-8 isn't control characters). And don't get me started on cat -t!
    
    At least less implemented the escaping functionality correctly, locale aware - when you unset LANG you'll get:
    F<E2><80><8B>or exam<E2><80><8B>ple, I<E2><80><99>ve ins<E2><80><8B>erted 10 ze<E2><80><8B>ro-width spa<E2><80><8B>ces in<E2><80><8B>to thi<E2><80><8B>s sentence, c<E2><80><8B>an you tel<E2><80><8B><E2><80><8B>l?
    
    Which turns unicode into moar garbage, which I think is fitting.
    
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
    
- Re:Not showing my username (Score: 2) by maxwell demon on Friday April 06 2018, @08:19AM (1 child)
  
  by maxwell demon (1608) on Friday April 06 2018, @08:19AM (#663330) Journal
  wc -c does not count characters, but bytes. With multi-byte character sets (like Unicode) the two are not the same. From the wc man page:

  DESCRIPTION [...]
    -c, --bytes
      print the byte counts
    -m, --chars
      print the character counts

  With the correct option, you get:

  $ echo -n 'F​or exam​ple, I’ve ins​erted 10 ze​ro-width spa​ces in​to thi​s sentence, c​an you tel​​l?' | tr -d '[[:print:]]' | wc -m
  11

  Well, it's still one too many, right? Well, no:

  $ echo 'F​or exam​ple, I’ve ins​erted 10 ze​ro-width spa​ces in​to thi​s sentence, c​an you tel​​l?' | tr -d '[[:print:]]'
  ’

  So tr considers that apostrophe as non-printable. It clearly is not a zero-width space, so there remain 10 zero-width spaces. Why is that? Well, let's look at it:

  $ echo -n '’' | xxd
  0000000: e280 99                                  ...

  This is actually the following Unicode character:

  U+2019 RIGHT SINGLE QUOTATION MARK
  UTF-8: 0xE2 0x80 0x99

  Conclusions:
  1) The author can count.
  2) You don't know your tools.
  3) tr doesn't correctly classify Unicode characters.
  --
  The Tao of math: The numbers you can count are not the real numbers.
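
The byte-versus-character point above can be reproduced in Python, as a cross-check rather than a replacement for the shell commands. The sentence is rebuilt with the zero-width spaces written as escapes at the positions shown in the od and less output earlier in the thread. Filtering on "non-ASCII" is an assumption that roughly mirrors what tr left behind in that transcript.

```python
import unicodedata

# The article's sentence, with its ten zero-width spaces written as \u200b escapes.
sentence = (
    "F\u200bor exam\u200bple, I\u2019ve ins\u200berted 10 ze\u200bro-width "
    "spa\u200bces in\u200bto thi\u200bs sentence, c\u200ban you tel\u200b\u200bl?"
)

# Everything outside ASCII: the zero-width spaces plus the curly apostrophe.
non_ascii = [ch for ch in sentence if ord(ch) > 0x7F]

char_count = len(non_ascii)                                     # what wc -m reports
byte_count = sum(len(ch.encode("utf-8")) for ch in non_ascii)   # what wc -c reports
zwsp_count = sentence.count("\u200b")                           # zero-width spaces only

print(char_count, byte_count, zwsp_count)

# Show how Unicode itself classifies the two kinds of leftover character.
for ch in sorted(set(non_ascii)):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
```

Every leftover character here takes three bytes in UTF-8, which is why the byte count is exactly three times the character count; the extra character beyond the ten zero-width spaces is the U+2019 apostrophe.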
  - Re:Not showing my username (Score: 2) by FatPhil on Friday April 06 2018, @08:49AM
    
    by FatPhil (863) <{pc-soylent} {at} {asdf.fi}> on Friday April 06 2018, @08:49AM (#663342) Homepage
    
    But:
    1) The author uses specific quotation marks as apostrophes. That's as big a mistake as miscounting would have been.
    2) Once I'd od'd it, which indeed is what I did first, I saw they were all 3-byte, so 33 bytes tells me exactly the same information as 11 characters. I would also have been interested in knowing about non-utf8 byte sequences, even if they would invalidate the stream (but Postel's law...)
    3) That's a weird one, I presume a standard library is used, and that should get things right (as the unicode consortium provide an explicit list of all the classes). Someone who gives a fuck about unicode should file a bug report. (So not me.)
    
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
    
- Re:Not showing my username (Score: 0) by Anonymous Coward on Friday April 06 2018, @03:06PM
  
  by Anonymous Coward on Friday April 06 2018, @03:06PM (#663446)
  
  did anyone else think their username would show up?
  this is an example from another site, posted here, that was described as a way of hiding spaces, not of revealing usernames. the author's special forum used a different but similar technique to identify its leaker under controlled circumstances, among users logged in to that site, not users logged in to some other site the author has probably never visited.
  i want to know whether the writing is a bad example of instruction or whether the expectation was simply not widespread
  
Re:Not showing my username (Score: 2) by Rivenaleem on Friday April 06 2018, @01:02PM

by Rivenaleem (3400) on Friday April 06 2018, @01:02PM (#663404)

I tried using the webpage https://umpox.github.io/zero-width-detection/ [github.io] [github.io] and it did not show my username. When I pasted NewNic it into Diff Checker, there were no zero width characters.
I tried using Firefox and Chrome under Linux.
I dunno, it worked fine for me.
