Vint Cerf, the godfather of the Internet, spoke in Sydney, Australia on Wednesday and issued a blunt call to action for a digital preservation regime for content and code to be quickly put in place to counter the existing throwaway culture that denies future generations an essential window into life in the past. He emphasized that this was especially needed for the WWW. Due to the volatile nature of electronic storage media as well as the format in which information is encoded, it is not possible to preserve digital material without prior planning and action.
[...] While the digital disappearance phenomenon has so far mainly vexed official archivists and librarians, Cerf's take is that as everything goes digital from creation, the risk of accidental or careless memory loss increases correspondingly.
Archivists have for decades fought publicly for open document formats to hedge against proprietary and vendor risks – especially when classified material usually can only be made public after 30 to 50 years, sometimes longer.
From iTnews: Internet is losing its memory: Cerf
(Score: 3, Informative) by c0lo on Saturday June 30 2018, @11:03AM (8 children)
For digital info, the real problem is the storage format.
With a closed and unspecified digital format, any creation will be lost once the provider of the technology no longer supports it, irrespective of whether that creation is in the public domain or still under copyright.
That is unlike paper, film, canvas or stone/metal.
https://www.youtube.com/watch?v=aoFiw2jMy-0 https://soylentnews.org/~MichaelDavidCrawford
(Score: 2) by takyon on Saturday June 30 2018, @12:35PM (1 child)
All of those are subject to degradation and destruction. Particularly paper, film, and canvas. They also have a low information density.
Ultimately, we want to develop storage with an indefinite [soylentnews.org] lifespan [soylentnews.org] as well as an unprecedented density. DNA storage is on the table (455 exabytes per gram [wikipedia.org] or 215 petabytes per gram [wikipedia.org]?), and it can be replicated easily with PCR machines, but it doesn't look convenient. Wikipedia also gives a questionable estimate of 35 bits/electron ∴ 3 exabytes/in² for electronic quantum holography. You could imagine some kind of crystalline medium being used for holographic storage while lasting many times longer than typical optical discs.
As for the problem of unspecified or lost formats, if we manage to get exabytes of storage into the hands of every individual on the planet, and zettabytes or yottabytes in larger organizations, we could easily spread lots of knowledge and culture around*, which can be copied endlessly over the internet or whatever networks exist at that point, irrespective of format. Which brings us back to the problem of copyright laws making it harder (though far from impossible) to archive and share our ongoing history.
*The entirety of compressed English Wikipedia can fit onto a 16 GB storage device. An exabyte is 62.5 million times larger.
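A quick sanity check of that footnote, assuming decimal (SI) units; the variable names are only for illustration:

```python
# Decimal (SI) units are assumed; binary units (GiB, EiB) would give a different ratio.
GB = 10**9
EB = 10**18

wikipedia_dump_bytes = 16 * GB  # device size quoted in the footnote
ratio = EB // wikipedia_dump_bytes

print(f"{ratio:,}")  # 62,500,000
```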
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2) by c0lo on Saturday June 30 2018, @02:27PM
So it's the current digital storage media. Until the linked tech gets into mass production, that's speculation.
Uhu. Go read some Word documents from 15 years ago, with OLE-embedded [wikipedia.org] blobs. On 32-bit CPUs.
Imagine how well those exabytes will read after 100 years if you don't have any format spec.
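To make the point concrete, here is a minimal sketch of how far you get without a spec. It recognises two containers by their well-known leading bytes (the OLE compound file signature used by legacy Word documents, and the ZIP local file header used by OOXML). The helper names are hypothetical, and identifying the container says nothing about how to decode an embedded blob inside it.

```python
# Leading "magic" bytes of two container formats.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file (legacy .doc/.xls/.ppt)
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header (OOXML, ODF, ...)


def sniff_container(header: bytes) -> str:
    """Guess the container type from the first bytes of a file (hypothetical helper)."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify its container."""
    with open(path, "rb") as f:
        return sniff_container(f.read(8))
```

Everything past the container header still needs a format description, or a working copy of the software that wrote it.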
(Score: 2) by maxwell demon on Saturday June 30 2018, @12:48PM (1 child)
Such as encryption with an undisclosed key?
Public domain content will much more likely be available in a well-documented, open format than proprietary content.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by frojack on Saturday June 30 2018, @06:27PM
ONLY if it is popular and replicated widely. And even then it ends up on someone's home machine which goes poof sooner or later.
Worse is if the public domain content is stored in, say, some old version of Microsoft Word. Resurrecting these files can be a nightmare because the structure was never fully publicly documented. The new tendency among clueless people to trust Microsoft with all of their documents (Office 365) is even worse.
Yet local government records tend to be stored in electronic format all the time. Paper copies take up too much room. It's all on the server. They've got backup tapes. It's been decades since they tried to read those tapes, and the hardware has been replaced twice in the meantime. What could possibly go wrong?
No, you are mistaken. I've always had this sig.
(Score: 2) by darkfeline on Sunday July 01 2018, @08:32PM (3 children)
It's not that bad. I can play wav, mp3, mkv, vorbis, mpeg, webm, etc etc on Linux just fine. These formats and their decoders aren't going anywhere anytime soon. While in theory some of the formats have private patents, in practice I don't see it possible to enforce; you can't ban something that everyone and their grandma is using, and Congress critters, corrupt as they are, also have family photos and videos they won't take kindly to not being able to view.
I worry much more about reliable storage than reliable storage format. You can't really get reliable storage (protection from regular drive death and natural disasters and "oops") without building multiple dedicated storage racks around the world or resorting to the cloud.
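A minimal sketch of what checking those copies against each other could look like, assuming each replica can be read back as bytes. It hashes every copy and flags the ones that disagree with the majority. The replica names and the `find_damaged` helper are made up for illustration, and a real setup would stream large files rather than hold them in memory.

```python
import hashlib
from collections import Counter


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a blob."""
    return hashlib.sha256(data).hexdigest()


def find_damaged(replicas: dict[str, bytes]) -> list[str]:
    """Return names of replicas whose digest differs from the majority digest."""
    digests = {name: sha256_of(blob) for name, blob in replicas.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)


# Hypothetical replicas: two intact copies and one with a flipped byte.
copies = {
    "home_nas": b"family photo bytes",
    "offsite_disk": b"family photo bytes",
    "cloud": b"family photo bytez",
}
print(find_damaged(copies))  # ['cloud']
```

With only two copies a mismatch tells you something is wrong but not which copy to trust, which is one argument for keeping at least three.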
Join the SDF Public Access UNIX System today!
(Score: 2) by c0lo on Monday July 02 2018, @01:29AM (2 children)
For now.
Define soon. 20 years from now? 50 years from now?
Besides, note that most of these formats are documented.
OOXML is a public standard. And yet, it allows OLE blobs.
In practice, even open standards (much less patents) do not protect against "Embrace, Extend, Extinguish" practices.
(Score: 2) by darkfeline on Monday July 02 2018, @04:10AM (1 child)
Why do they have to be documented? I have a decoder that can play these files now, the decoder is going to continue to be able to play those files forever, and the decoder isn't going to just disappear. I can make effectively infinite copies of both the decoders and the files. Barring catastrophe, the ability to play said files is not going to disappear, and a catastrophe would endanger pretty much any storage format, analog or digital, especially if we lose written languages.
The format could literally be a black box which nobody knows and it doesn't matter in the slightest. No one would be able to create new files using the format, but that's irrelevant for archival purposes. I can still play them and that's all that matters.
(Score: 2) by c0lo on Monday July 02 2018, @05:01AM
Z80 code is no longer used by any mainstream computers, and it has been only 40 years since it was extensively used.
Any guarantee that, in 40 years' time, the mainstream default won't be quantum or neural computing, with von Neumann architectures a thing of the nostalgic past?
Forever is a long time, don't bet on it.