Vint Cerf, the godfather of the Internet, spoke in Sydney, Australia on Wednesday and issued a blunt call to action for a digital preservation regime for content and code to be quickly put in place to counter the existing throwaway culture that denies future generations an essential window into life in the past. He emphasized that this was especially needed for the WWW. Due to the volatile nature of electronic storage media as well as the format in which information is encoded, it is not possible to preserve digital material without prior planning and action.
[...] While the digital disappearance phenomenon is one which has so far mainly vexed official archivists and librarians for some years now, Cerf's take is that as everything goes digital from creation, the risk of accidental or careless memory loss increases correspondingly.
Archivists have for decades fought publicly for open document formats to hedge against proprietary and vendor risks – especially when classified material usually can only be made public after 30 to 50 years, sometimes longer.
From iTnews: Internet is losing its memory: Cerf
(Score: 2) by canopic jug on Saturday June 30 2018, @06:07AM (2 children)
IMHO, if a webpage doesn't have a date or Linux version prominently displayed then it needs to sink very quickly to the bottom of the innertubes.
Sink but not disappear: that is a hard problem for the search engines, especially with malicious actors like SEOs out there. Novelty is not the only context in which items on the net have value. There are many examples, especially when starting to get into historical research. However, a simple technical example is specifications and standards. Those don't change often yet are important. However, because standards can be many years old, they can and do get deleted even from institutions of record.
Money is not free speech. Elections should not be auctions.
(Score: 2) by frojack on Saturday June 30 2018, @07:18PM (1 child)
The same fate awaits all of them. Sooner or later someone abandons the content, and it's gone. The web is fundamentally the wrong instrument for knowledge preservation.
Stuff disappears and nobody cares, and probably NOBODY SHOULD.
Late on the afternoon of September 11th I started a program (I don't even remember what software it was anymore) that archived entire web sites. I pointed it at all the major news sites: CNN, Fox, MSNBC. Each amassed all linked pages, many levels deep (5 or 8 or something). I ran each for hours till I filled my server. I then burned them to archival quality DVDs.
I read them every once in a while, to remind myself how unreliable early reporting can be, but also how quickly early truths can be buried by political correctness. Searching for those same pages online today (or even looking for the same text phrases) reveals just how fleeting anything stored on the net is. Even the WayBack Machine [archive.org] is hopelessly incomplete. My private archive probably dies with me.
As a kid in the 6th grade, I had to do a presentation report on an incident in World War II. For some reason I chose the (pocket battleship) Admiral Graf Spee incident. I had encyclopedias, library books, and newspaper archives at my disposal. The local newspapers of the day were just like the web. Some wrong and confusing stories, slowly being replaced by more complete and correct ones. But the point was the OLD and the NEWer were all there together in chronological order in the newspaper archives (microfiche in some cases). You could work your way through the story, separate facts from fiction, unmuddle the story (even as a 6th grader). Facts and chronology become clearer.
It's very difficult to do that on the web. Historical information disappears. Conspiracy theorists come charging to the fore.
No, you are mistaken. I've always had this sig.
(Score: 0) by Anonymous Coward on Saturday June 30 2018, @09:49PM
"My private archive probably dies with me."
It will have to stop being private. Maybe not fully public, if you find a place that accepts it under such terms, like some libraries do with writers' manuscripts. As for fully public, https://archive.org/ [archive.org] accepts uploads, it's not just "the wayback machine". https://archiveteam.org/ [archiveteam.org] is a different project, maybe it can give you some hints about what to do.