
SoylentNews is people

posted by takyon on Thursday May 07 2015, @04:53AM   Printer-friendly
from the infinite-monkeys dept.

In 1941, Jorge Luis Borges wrote The Library of Babel, a story describing an almost infinite library that contains every possible combination of letters in a vast collection of 410-page books.

Jonathan Basile has spent six months learning how to build a virtual version that can generate every possible page of 3,200 characters:

The Library currently allows users to choose from about 10^4677 potential books. The site also features a search tool, which allows users to retrieve the location in the library of any known page of text. Any individual page of Hamlet or the Bible can be found in the library, but the possibility of finding any other page from the same work in the same volume is vanishingly small.

While the library contains every possible page, it does not yet hold every possible combination of those pages. If this restriction were lifted, Basile explains on the site, the library would house "every book that ever has been written, and every book that ever could be – including every play, every song, every scientific paper, every legal decision, every constitution, every piece of scripture, and so on".

Basile evokes the comprehensive nature of the library's "blind volumes", saying: "To take a recent example, the confidential documents leaked by Edward Snowden... will be there somewhere. It's only a matter of knowing where to look for them."

 
  • (Score: 2) by kaganar (605) on Thursday May 07 2015, @02:18PM (#179920)
    Similar to /dev/random, but with some important differences:
    • You can seek to wherever you like. (/dev/random doesn't support seeking at all.)
    • The pages are indexed to allow for fast searching. (Try searching /dev/random for a long string. You'll be waiting for a while.)
    • The number of pages is finite. (Not necessarily so with /dev/random, depending on how it works under the hood.)

    You may be wondering how the virtual Library of Babel can allow indexing and searching so efficiently when /dev/random does not. I'll be honest, I'm not sure exactly how the virtual library was implemented, but it seems probable that it works something like the following:

    • The index of each page is reversibly hashed into its contents, that is, run through a function that can also be run backwards. This makes the contents appear random. Because the function is reversible, no two indexes can produce the same page, so each page is unique.
    • If you want to find the index for a full page of text, just reverse the hash of the text to obtain its index.
    • This last property is put to good use by playing on our perception of what a "search engine" does. Suppose I search for a short word, much less than 3200 characters long. To find the index of a page that contains the word, all I need to do is put whatever I want before and/or after the word until the page is full, and then reverse the hash to reveal the index of that page. I can do this as many times as I like with whatever padding I want. For example, I could pad it with English words.
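
    The three points above can be sketched in a few lines of Python. To be clear, this is only a guess at the general idea, not the site's actual code: the 29-symbol alphabet, the affine map (multiply by A, add C, modulo the number of pages), and every name below are assumptions made up for illustration. Any function that is one-to-one on the page indexes would do; an affine map is just the simplest one to invert. It needs Python 3.8+ for the modular inverse via pow().

```python
import math
import random

# Assumption: a 29-symbol alphabet; the story only gives the page length.
ALPHABET = "abcdefghijklmnopqrstuvwxyz ,."
PAGE_LEN = 3200                      # characters per page, from the story
BASE = len(ALPHABET)
N = BASE ** PAGE_LEN                 # number of distinct pages

# Hypothetical constants: any A coprime to N makes the map a bijection.
_rng = random.Random(2015)
A = _rng.randrange(2, N)
while math.gcd(A, N) != 1:
    A += 1
C = _rng.randrange(N)
A_INV = pow(A, -1, N)                # modular inverse of A (Python 3.8+)


def index_to_page(index):
    """Forward direction: scramble the index, then write it in base 29."""
    n = (A * index + C) % N
    chars = []
    for _ in range(PAGE_LEN):
        n, digit = divmod(n, BASE)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def page_to_index(page):
    """Reverse direction: read the page as a base-29 number, unscramble it."""
    if len(page) != PAGE_LEN:
        raise ValueError("a page must be exactly PAGE_LEN characters long")
    n = 0
    for ch in page:
        n = n * BASE + ALPHABET.index(ch)   # ValueError on unknown symbols
    return ((n - C) * A_INV) % N


def search(text, rng=None):
    """'Search' by padding text out to a full page and inverting the map."""
    rng = rng or random.Random()
    pad = PAGE_LEN - len(text)
    if pad < 0:
        raise ValueError("text is longer than one page")
    start = rng.randrange(pad + 1)          # where the text lands on the page
    filler = [rng.choice(ALPHABET) for _ in range(pad)]
    page = "".join(filler[:start]) + text + "".join(filler[start:])
    return page_to_index(page)
```

    Calling search("to be, or not to be") returns an index, and index_to_page() on that index gives back a full page containing the phrase. Nothing was ever looked up; the page was built around the query and its index computed afterwards, which is the trick described in the last bullet.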

    Is it really a search engine? Well, yes, it did actually find some virtual pages in virtual books on virtual shelves in virtual rooms in the virtual library. :) It's just not a particularly useful search engine.
