SoylentNews is people

posted by Fnord666 on Sunday January 28 2018, @01:49PM
from the augmented-intelligence dept.

Arthur T Knackerbracket has found the following story:

Greg Kondrak, a computer scientist at the University of Alberta's AI lab, claims to have begun decoding the mysterious text of the Voynich manuscript with his novel algorithm, CTVNews reported.

[...] It is believed that the manuscript is somehow related to women's health but there is no solid clue, according to the report. People have made wild guesses regarding the code, with at least eight making firm claims – only to be debunked later on.

Kondrak, however, took a different approach towards solving the problem – artificial intelligence. "Once you see it, once you find out the mystery, this is a natural human tendency to solve the puzzle," the computer scientist told CTVNews. "I was intrigued and thought I could contribute something new."

He and his co-author Bradley Hauer combined novel AI algorithms with statistical procedures to identify and translate the language. The approach, which had been used to translate United Nations Universal Declaration of Human Rights in 380 languages, came in handy and suggested the language was Hebrew, albeit with critical tweaks.

They found that the letters in every word had been reordered and the vowels were dropped in the code. The first complete sentence which the AI decrypted read, "She made recommendations to the priest, man of the house and me and people." One section of the text carries words that translate into "farmer", "light", "air", and "fire".
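
As a rough illustration of that scheme, the sketch below encodes a word by dropping its vowels and sorting the remaining letters, then looks up which dictionary words share the same encoded form. The alphabetical sort, the tiny English word list, and the names `encode_word`, `build_index`, and `candidates` are all assumptions made for illustration; the article only says the letters were reordered and the vowels dropped, and the researchers were working with Hebrew.

```python
VOWELS = set("aeiou")

# Hypothetical toy lexicon; a real attempt would need a period-appropriate word list.
LEXICON = ["priest", "stripe", "house", "people", "fire", "fair"]


def encode_word(word):
    """Drop vowels, then sort the remaining letters alphabetically (assumed ordering)."""
    consonants = [c for c in word.lower() if c.isalpha() and c not in VOWELS]
    return "".join(sorted(consonants))


def build_index(lexicon):
    """Map each encoded form to every lexicon word that produces it."""
    index = {}
    for word in lexicon:
        index.setdefault(encode_word(word), []).append(word)
    return index


INDEX = build_index(LEXICON)


def candidates(cipher_word):
    """Return all lexicon words consistent with the given cipher word."""
    return INDEX.get(encode_word(cipher_word), [])


if __name__ == "__main__":
    for cipher in ["prst", "fr", "hs"]:
        print(cipher, "->", candidates(cipher))
```

Even in this toy lexicon, "prst" matches both "priest" and "stripe", and "fr" matches both "fire" and "fair", which hints at why a decoding of this kind leaves so much room for interpretation.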

The translated line could be the start of something big, but Kondrak still has a long way to go, and he stresses the need for complementary human assistance. It is also not clear how accurate the translation really is.

"Somebody with very good knowledge of Hebrew and who's a historian at the same time could take this evidence and follow this kind of clue," he said while highlighting the need of someone who could make sense of the translated text.

For those who may not be familiar with the manuscript, see Voynich Manuscript at Wikipedia, or read it yourself at archive.org (JavaScript required).


This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Interesting) by FatPhil on Sunday January 28 2018, @03:04PM (6 children)

    by FatPhil (863) <reversethis-{if.fdsa} {ta} {tnelyos-cp}> on Sunday January 28 2018, @03:04PM (#629463) Homepage
    "The approach, which had been used to translate United Nations Universal Declaration of Human Rights in 380 languages..."

    380 *known* languages, this is a different problem domain.

    "suggested the language was Hebrew"

    Just looking at the texts, that seems unlikely. I'd like to see the entropy measurements of the letters, digrams, trigrams, and word forms, and compare them to a range of languages. I just don't imagine it having the same statistics as Hebrew.
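
    A minimal sketch of the kind of measurement meant here: Shannon entropy of the character n-gram distribution for n = 1, 2, 3. The function name `ngram_entropy` and the sample strings are placeholders, not real corpus data; a meaningful comparison would need a transliteration of the manuscript and sizeable samples of each candidate language.

```python
import math
from collections import Counter


def ngram_entropy(text, n):
    """Shannon entropy in bits of the character n-gram distribution of text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Placeholder strings only; substitute real corpora before drawing conclusions.
    samples = {
        "sample_a": "the quick brown fox jumps over the lazy dog",
        "sample_b": "abab abab abab abab abab abab abab abab",
    }
    for name, text in samples.items():
        row = ", ".join(f"H{n}={ngram_entropy(text, n):.3f}" for n in (1, 2, 3))
        print(name, row)
```

    Word-form statistics could be handled the same way by splitting on whitespace and counting whole words instead of character n-grams.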

    It's hard to *disprove* such claims though. And I'm an ardent Popperite.

    They should start with /tabula rasa/, run their system over the bulk of the text (but nothing apart from the text), then offer it snippets of text from a random page and see if the trained system can tell you whether those words are next to a picture of a man, a woman, a beast, a plant, or whatever. If it can guess the image better than random, maybe they've got something. That presumes that the text and the images are correlated, of course. If the images were necessary for the training, then that changes things, but that's not insurmountable. Their program just has to compete favourably (and statistically significantly) against a naive program that is only told which words appear near which images, and uses Bayes to estimate probabilities of images appearing near the given test words.
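
    The naive baseline could look something like the sketch below: a multinomial naive Bayes classifier with add-one smoothing that sees only (words, image label) pairs. The class name `NaiveBayesBaseline`, the tokens `w1` to `w4`, and the labels are made up for illustration; nothing here comes from the manuscript or from the researchers' system.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesBaseline:
    """Multinomial naive Bayes over words, predicting the nearby image label."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # additive smoothing constant
        self.label_counts = Counter()           # snippets seen per label
        self.word_counts = defaultdict(Counter) # word frequencies per label
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (list_of_words, image_label) pairs."""
        for words, label in samples:
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def predict(self, words):
        """Return the label with the highest posterior log-probability."""
        total = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n / total)
            for w in words:
                score += math.log((counts[w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical training pairs: tokens near a "plant" or a "woman" image.
    train = [
        (["w1", "w2"], "plant"),
        (["w1", "w3"], "plant"),
        (["w4", "w2"], "woman"),
        (["w4", "w4"], "woman"),
    ]
    model = NaiveBayesBaseline()
    model.fit(train)
    print(model.predict(["w1"]))        # label chosen for a held-out snippet
    print(model.predict(["w4", "w2"]))
```

    Held-out accuracy of this baseline, compared against chance and against the system under test, would give the falsifiable comparison asked for above.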
    --
    Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
  • (Score: 4, Interesting) by zocalo on Sunday January 28 2018, @03:48PM (2 children)

    by zocalo (302) on Sunday January 28 2018, @03:48PM (#629476)
    I guess that depends on what they meant by being "in Hebrew", since the manuscript obviously isn't actually written in the Hebrew script but in one unique to the document. It could be that the AI suggested "Hebrew" based on the breakdown of symbol counts, which symbols are adjacent to which, etc., but it could also be one level abstracted from that and be based more on sentence structure, e.g. if you were to give it a compilation of Yoda's sayings to analyse then it *should* come back with Japanese, even though they were actually spoken in English. People have done such analysis on the manuscript before and the general consensus seems to be that there's a method behind the madness and it's not just random gibberish (it might still be well structured gibberish though), so it may well be as simple as taking the sentence structure of one language, writing it down word for word in another, then using some kind of substitution cypher to turn it into the Voynich script.

    Given the number of possible permutations though, I'm not holding out much hope that this analysis has got the right combination.
    --
    UNIX? They're not even circumcised! Savages!
    • (Score: 3, Funny) by requerdanos on Sunday January 28 2018, @05:00PM

      by requerdanos (5997) Subscriber Badge on Sunday January 28 2018, @05:00PM (#629503) Journal

      An "AI" to decode [The Voynich Manuscript | The Cosmic Microwave Background Radiation | Leetspeak | etc. ] ...


      # initialize
      if (exist: spaces) {
              delimiter=spaces;
      } else {
              delimiter=random(portion of input);
      }
      # process
      while more_document_exists {
              this_gibberish_word = next_word_until_delimiter();
              this_translated_word = word_this_nonsense_is_mathematically_least_dissimilar_to(this_gibberish_word);
              add_to_output (this_translated_word)
      }
      #enjoy success

      I'd be surprised if the AI in TFA differs conceptually by much.
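
      For anyone who wants to run the joke, here is a minimal Python rendering of the pseudocode above. It uses `difflib.get_close_matches` from the standard library as the "least dissimilar" function; the word list and the name `decode` are placeholders, and this is not a claim about how the system in TFA works.

```python
import difflib

# Hypothetical target vocabulary; any word list would do.
DICTIONARY = ["farmer", "light", "air", "fire", "priest", "house"]


def decode(document, dictionary=DICTIONARY):
    """Replace every whitespace-delimited token with its closest dictionary word."""
    output = []
    for gibberish in document.split():
        # cutoff=0.0 forces a match no matter how poor the similarity is.
        match = difflib.get_close_matches(gibberish, dictionary, n=1, cutoff=0.0)
        output.append(match[0] if match else gibberish)
    return " ".join(output)


if __name__ == "__main__":
    print(decode("farmr lihgt"))
```

      Because the cutoff is zero, any input at all comes back as dictionary words, which is rather the point.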

    • (Score: 2) by FatPhil on Sunday January 28 2018, @09:52PM

      by FatPhil (863) <reversethis-{if.fdsa} {ta} {tnelyos-cp}> on Sunday January 28 2018, @09:52PM (#629593) Homepage
      I was guessing that they came up with a "maybe the vowels are missing?" idea, and then just made a leap to "like Hebrew", and ran with it.

      But just look at it as a tessellation of symbols: it bears no resemblance in dynamics to Hebrew texts, and I can't believe it has the same kind of statistical distribution. It looks like poetic latinate text (the amount of repetition would be unusual for prose).
      --
      Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
  • (Score: 1) by tftp on Sunday January 28 2018, @06:30PM (1 child)

    by tftp (806) on Sunday January 28 2018, @06:30PM (#629529) Homepage
    I thought it was already solved. The plants are Mexican [newscientist.com], and the language is [a version of] Nahuatl [voynichms.com], written in some old Spanish font.
  • (Score: 2) by darkfeline on Tuesday January 30 2018, @04:34AM

    by darkfeline (1030) on Tuesday January 30 2018, @04:34AM (#630189) Homepage

    > 380 *known* languages, this is a different problem domain.

    This is a different problem domain for a human, not necessarily so for an AI.

    The thing that you have got to understand is that the way AI "thinks" is fundamentally different from humans. It's like the difference between proving a theorem using geometry and proving the same theorem using linear algebra. The strategy, approach, and difficulties are going to be completely different. That's why an AI might mistake a poodle for a car, but a human might mistake a jar for a human face. Both have weaknesses, they're just completely different weaknesses.

    --
    Join the SDF Public Access UNIX System today!