Anonymous coders can be identified using stylometry and machine learning techniques applied to executable binaries:
Source code stylometry – analyzing the syntax of source code for clues about the author – is an established technique used in digital forensics. As the US Army Research Laboratory (ARL) puts it, "Stylometry research has proven that anonymous code contributors can be de-anonymized to reveal the original author, provided the author has published code before."
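To make the idea of source code stylometry concrete, here is a minimal sketch, not the method used by ARL or the researchers. It assumes that a handful of layout and naming habits (tab indentation, brace placement, snake_case versus camelCase) are enough to separate two hypothetical authors, "alice" and "bob", whose C snippets are invented for this illustration. Real systems use far richer feature sets and proper classifiers.

```python
import math
import re

def style_features(source):
    """Map a source snippet to a small vector of layout and naming habits."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    n_lines = max(len(lines), 1)
    n_names = max(len(names), 1)
    return [
        sum(ln.startswith("\t") for ln in lines) / n_lines,   # tab-indented lines
        sum(ln.strip() == "{" for ln in lines) / n_lines,     # opening brace on its own line
        sum("_" in name for name in names) / n_names,         # snake_case identifiers
        sum(bool(re.search(r"[a-z][A-Z]", name)) for name in names) / n_names,  # camelCase
    ]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def attribute(unknown, known):
    """Return the known author whose sample is stylistically closest."""
    target = style_features(unknown)
    return min(known, key=lambda author: distance(style_features(known[author]), target))

# Hypothetical samples: same logic, different habits.
ALICE = (
    "int count_items(int *arr, int n) {\n"
    "\tint total_sum = 0;\n"
    "\tfor (int i = 0; i < n; i++) {\n"
    "\t\ttotal_sum += arr[i];\n"
    "\t}\n"
    "\treturn total_sum;\n"
    "}\n"
)
BOB = (
    "int countItems(int *arr, int n)\n"
    "{\n"
    "    int totalSum = 0;\n"
    "    for (int i = 0; i < n; i++)\n"
    "    {\n"
    "        totalSum += arr[i];\n"
    "    }\n"
    "    return totalSum;\n"
    "}\n"
)
UNKNOWN = (
    "int max_value(int *arr, int n) {\n"
    "\tint best_so_far = arr[0];\n"
    "\tfor (int i = 1; i < n; i++) {\n"
    "\t\tif (arr[i] > best_so_far) {\n"
    "\t\t\tbest_so_far = arr[i];\n"
    "\t\t}\n"
    "\t}\n"
    "\treturn best_so_far;\n"
    "}\n"
)

KNOWN = {"alice": ALICE, "bob": BOB}
print(attribute(UNKNOWN, KNOWN))
```

The unknown snippet does something different from either known sample, yet its tabs, same-line braces and snake_case names place it next to "alice". That is the premise behind the ARL quote: the habits carry over from code the author has already published.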
The technique can help identify virus makers as well as unmask the creators of anti-censorship tools and other outlawed programs. It has the potential to pierce the privacy that many programmers assume they have.
Source code is designed to be human-readable, but binaries – typically produced by compiling or assembling source code – have fewer characteristics that may suggest authorship. Toolchains can be instructed to strip out variable names, function names and other symbols and metadata – which may say something about the author – and alter the structure of code through optimization.
Nonetheless, the researchers – Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt and Arvind Narayanan – building on work described in a 2011 paper, demonstrate that binary files can be analyzed using machine-learning and stylometric techniques.
If you want to remain an anonymous coder, you'd better not contribute anything under your own name publicly:
When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries (arXiv:1512.08546 [cs.CR])
We evaluate our approach on data from the Google Code Jam, obtaining attribution accuracy of up to 96% with 100 and 83% with 600 candidate programmers. We present an executable binary authorship attribution approach, for the first time, that is robust to basic obfuscations, a range of compiler optimization settings, and binaries that have been stripped of their symbol tables. We perform programmer de-anonymization using both obfuscated binaries, and real-world code found "in the wild" in single-author GitHub repositories and the recently leaked Nulled.IO hacker forum. We show that programmers who would like to remain anonymous need to take extreme countermeasures to protect their privacy.
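As a rough illustration of attribution without any symbols or source text, the sketch below compares raw byte-pair frequencies between machine-code blobs. This is a swapped-in toy technique, not the approach evaluated in the paper. The authors "alice" and "bob" and their byte strings are hypothetical: one zeroes registers with `xor reg, reg` (x86 bytes `31 c0`, `31 db`, `31 c9`), the other with `mov reg, 0` (`b8`, `bb`, `b9` followed by a 32-bit zero).

```python
import math
from collections import Counter

def bigram_profile(blob):
    """Count overlapping byte pairs in a blob of machine code."""
    return Counter(zip(blob, blob[1:]))

def cosine(a, b):
    """Cosine similarity between two byte-pair count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def attribute(unknown, known):
    """Return the known author whose blob is most similar to the unknown one."""
    target = bigram_profile(unknown)
    return max(known, key=lambda author: cosine(bigram_profile(known[author]), target))

# Hypothetical stripped code fragments (no names, no symbols, just bytes).
KNOWN = {
    "alice": bytes.fromhex("31c031db31c931c031db"),            # xor-style zeroing
    "bob": bytes.fromhex("b800000000bb00000000b900000000"),    # mov-style zeroing
}
UNKNOWN = bytes.fromhex("31c031c931db31c0")

print(attribute(UNKNOWN, KNOWN))
```

Here the unknown blob shares no byte pairs at all with "bob", so it is attributed to "alice". In real binaries the compiler chooses most instruction encodings, so a signal this crude would be far weaker than in this contrived case.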
(Score: 2) by maxwell demon on Thursday March 22 2018, @12:35PM (1 child)
Note that if such programmer identification is used in forensics, such a code transformation program could also be used to create false evidence against someone: Analyze code written by the target, then take some malware and optimize it to be "recognized" as the target's work by the algorithm. Spread the malware a little bit, then run the analysis on it (with the well-known result) and arrest the target who has been "identified" as the author.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by DannyB on Thursday March 22 2018, @02:29PM
Exactly what I had in mind when I said: Imagine the possibilities!
To transfer files: right-click on file, pick Copy. Unplug mouse, plug mouse into other computer. Right-click, paste.