
SoylentNews is people

posted by Fnord666 on Sunday January 28 2018, @01:49PM
from the augmented-intelligence dept.

Arthur T Knackerbracket has found the following story:

Greg Kondrak, a computer scientist from the University of Alberta's AI lab, claims to have begun decoding the mystery behind the unknown text of the Voynich manuscript with his novel algorithm, CTVNews reported.

[...] It is believed that the manuscript is somehow related to women's health but there is no solid clue, according to the report. People have made wild guesses regarding the code, with at least eight making firm claims – only to be debunked later on.

Kondrak, however, took a different approach towards solving the problem – artificial intelligence. "Once you see it, once you find out the mystery, this is a natural human tendency to solve the puzzle," the computer scientist told CTVNews. "I was intrigued and thought I could contribute something new."

He and his co-author Bradley Hauer combined novel AI algorithms with statistical procedures to identify and translate the language. The approach, which had been used to translate the United Nations Universal Declaration of Human Rights in 380 languages, came in handy and suggested the language was Hebrew, albeit with critical tweaks.

They found that the letters in every word had been reordered and the vowels were dropped in the code. The first complete sentence which the AI decrypted read, "She made recommendations to the priest, man of the house and me and people." One section of the text carries words that translate into "farmer", "light", "air", and "fire".

The translated line could be the start of something big, but Kondrak still has a long way to go, and he stresses the need for complementary human assistance. However, it is not clear how accurate the translation really is.

"Somebody with very good knowledge of Hebrew and who's a historian at the same time could take this evidence and follow this kind of clue," he said, highlighting the need for someone who could make sense of the translated text.

For those who may not be familiar with the manuscript, see Voynich Manuscript at Wikipedia, or read it yourself at archive.org (JavaScript required).


  • (Score: 3, Interesting) by The Archon V2.0 on Monday January 29 2018, @03:00PM (1 child)

    > On p. 85 of the technical article, there's another histogram showing that if you drop vowels in other languages (as they do in Hebrew), Latin, Italian, and English all have matches for vocabulary in the 75-85% range, similar to Hebrew.

    So the biggest takeaway from this is that the script might be an abjad instead of an alphabet? Hasn't that hypothesis been floating around for a decade (likely more) already?

  • (Score: 3, Interesting) by AthanasiusKircher on Thursday February 01 2018, @09:07PM

    Sorry for the late reply, but I'm not sure that's the takeaway. Dropping the vowels makes it easier to match multiple words to a single set of symbols, thereby increasing the apparent "match" stats for a test like they did here.

    My point is that it's likely the "high match" percentage is just due to that basic statistical fact, i.e., that an abjad is likely to have a higher "hit rate" just due to random coincidence. And since Hebrew is frequently written as an abjad, perhaps that's one of the only reasons Hebrew was ranked higher in their algorithm. (Though I'm hoping they actually realized this, since it's a pretty basic feature likely to influence the stats. I'm hoping they did take that into account and that Hebrew still stood above other languages... though it's not clear from what I read that the difference is statistically big enough to justify their confidence that Hebrew is actually the language "encoded" in the manuscript.)
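
The statistical point above can be illustrated with a toy sketch. This is not the method from the Kondrak and Hauer paper; it only assumes the encoding described in the story (letters reordered within each word, vowels dropped) and models it by sorting each word's letters into a canonical key. The word list, the `anagram_key` and `group_by_key` names, and the use of English rather than Hebrew are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: a plain five-vowel set, applied to English toy words.
VOWELS = set("aeiou")


def anagram_key(word, drop_vowels):
    """Canonical key for a word whose letters may have been reordered.

    Sorting the letters makes every anagram of a word share one key.
    With drop_vowels=True the vowels are removed first, as in an abjad.
    """
    letters = [c for c in word.lower() if c.isalpha()]
    if drop_vowels:
        letters = [c for c in letters if c not in VOWELS]
    return "".join(sorted(letters))


def group_by_key(words, drop_vowels):
    """Map each key to the list of dictionary words that produce it."""
    groups = defaultdict(list)
    for w in words:
        groups[anagram_key(w, drop_vowels)].append(w)
    return dict(groups)


# Hypothetical mini-dictionary, chosen only to make the effect visible.
WORDS = ["farmer", "framer", "former", "firmer", "fire", "fair", "far",
         "fur", "air", "ear", "light", "let", "late", "tale"]

with_vowels = group_by_key(WORDS, drop_vowels=False)
without_vowels = group_by_key(WORDS, drop_vowels=True)

print(len(with_vowels), "distinct keys with vowels kept")
print(len(without_vowels), "distinct keys with vowels dropped")
print("words matching the key of 'fire':",
      without_vowels[anagram_key("fire", drop_vowels=True)])
```

On this toy list, the 14 words collapse to 12 keys when vowels are kept but only 5 when they are dropped, so any given string of symbols has many more candidate "translations" in the vowel-less case. A larger real dictionary would be needed to say anything about the actual percentages in the paper.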