Stories
Slash Boxes
Comments

SoylentNews is people

posted by cmn32480 on Wednesday November 18 2015, @10:28AM   Printer-friendly
from the like-looking-in-a-mirror dept.

A mimic function changes a file A so it assumes the statistical properties of another file B. That is, if p(t,A) is the probability of some substring t occuring in A, then a mimic function f, recodes A so that p(t,f(A)) approximates p(t,B) for all strings t of length less than some n. This paper describes the algorithm for computing mimic functions and compares the algorithm with its functional inverse, Huffman coding. It also provides a description of more robust mimic functions which can be defined using context-free grammars.

In his short story, "The Purloined Letter", Edgar Allan Poe describes a search by the police for an incriminating letter. The police ransack the house and pry open anything that might be hiding it, but they cannot find it. They look for hidden compartments, poke in mattresses and search for secret hiding spaces with no success. The detective, C. Auguste Dupin, goes to the house and finds the letter hidden in a different envelope in plain sight. He says, "But the more I reflected upon the daring, dashing and discriminating ingenuity, ... the more satisfied I became that, to conceal this letter, the Minister had resorted to the comprehensive and sagacious expedient of not attempting to conceal it at all."

In many ways, the practical cryptographer faces the same problem. Messages need to get from one place to another without being read. A traditional cryptographer tries to guarantee the letter's security by sealing the message in a mathematical safe and shipping the safe. There is no attempt made to hide the fact that it is a letter at all. The cryptanalyst attacking the message may or may not be able to break the code, but he has little problem finding and identifying the carrier.

Many of the histories written about the cryptography community, however contain stories of how the analysis of the message traffic alone lead to intelligence coups. Mimic functions hide the identity of a text by recoding a file so its statistical profile approximates the statistical profile of another file. They can convert any file to be statistically identical to, for instance, the contents of the USENET newsgroups like rec.humor or the classified section of the Sunday New York Times. Their contribution to security is largely founded upon the assumption that the explosion of information traffic makes it impossible for humans to read everything. Anyone watching must use computers outfitted with statistical profiles to weed the interesting data from the mundane.


Original Submission

 
This discussion has been archived. No new comments can be posted.
Display Options Threshold/Breakthrough Mark All as Read Mark All as Unread
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Insightful) by FakeBeldin on Wednesday November 18 2015, @11:08AM

    by FakeBeldin (3360) on Wednesday November 18 2015, @11:08AM (#264795) Journal

    1. This concept sounds a lot like steganography [wikipedia.org] - basically, embedding a message within a text/photo/movie/sound piece. Famous example is watermarking - invisible in normal circumstances, but with the right detector clearly visible.

    (probably redundant for a good portion of SN'ers, but for the others:)
    2. Huffman encoding, in a nutshell: the most frequently occurring character in a file is encoded using the shortest possible bitstring. The 2nd most frequently occurring character is encoded using the 2nd shortest bitstring, etc.
    This is used in things like ZIP.

    3. One hallmark of a good cryptosystem is that the ciphertext (the encoded part) is indistinguishable* from a uniform random bitstring.
    To me, it seems that the choice of a mimic function depends on the input bit string and target language. So how does the receiver know how to invert the mimic function, without knowledge of at least some properties of the input bit string?
    It seems that these must somehow be communicated, which means that (Kerckhoff's principle [wikipedia.org]) the attacker will also know these properties. So how to ensure it's easy for the receiver to invert the mimic function, but hard for the attacker?

    * for whatever that means - and there are multiple definitions of "indistinguishable" in this context.

    On a sidenote, this gives me an interesting new approach to peer-review: read the abstract, formulate some questions, and then dive into the paper and see to what extent these are addressed.

    Starting Score:    1  point
    Moderation   +1  
       Insightful=1, Total=1
    Extra 'Insightful' Modifier   0  
    Karma-Bonus Modifier   +1  

    Total Score:   3