
SoylentNews is people

posted by mrpg on Saturday June 30 2018, @04:58AM
from the 404 dept.

Vint Cerf, the godfather of the Internet, spoke in Sydney, Australia on Wednesday and issued a blunt call to action for a digital preservation regime for content and code to be quickly put in place to counter the existing throwaway culture that denies future generations an essential window into life in the past. He emphasized that this was especially needed for the WWW. Due to the volatile nature of electronic storage media as well as the format in which information is encoded, it is not possible to preserve digital material without prior planning and action.

[...] While the digital disappearance phenomenon has mainly vexed official archivists and librarians for some years now, Cerf's take is that as everything goes digital from creation, the risk of accidental or careless memory loss increases correspondingly.

Archivists have for decades fought publicly for open document formats to hedge against proprietary and vendor risks – especially when classified material usually can only be made public after 30 to 50 years, sometimes longer.

From iTnews: Internet is losing its memory: Cerf



 
  • (Score: 2) by Snotnose on Saturday June 30 2018, @05:51AM (14 children)

    by Snotnose (1623) on Saturday June 30 2018, @05:51AM (#700568)

    I recently had an issue mounting a cifs share from my NAS to my pi. My biggest problem was old, outdated information on webpages that had neither a date, nor a version of Linux. Let alone the Linux distro.

    IMHO, if a webpage doesn't have a date or Linux version prominently displayed then it needs to sink very quickly to the bottom of the innertubes.

    I'm pretty sure this issue is prevalent outside of Linux.

    FWIW, that page that implied "smbclient //backup/Public/Downloads /media/Downloads -d 5 gives lots of debugging info" needs to die in a fire. I wasted hours on that before realizing that was an invalid set of parameters to smbclient.

    --
    When the dust settled America realized it was saved by a porn star.
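For readers who land here with the same problem: smbclient is an interactive, FTP-style client and takes no local mount point, so mounting a share normally goes through mount with the cifs filesystem type. The following is only a minimal sketch that assembles such a command line. The helper name cifs_mount_argv is hypothetical, the share and mount point are the ones quoted in the comment above, and the "guest" option is an assumption; a given NAS may instead need options such as username= or vers=.

```python
import shlex


def cifs_mount_argv(share, mountpoint, options=("guest",)):
    """Build the argv for mounting an SMB/CIFS share via mount(8).

    Hypothetical helper for illustration; it only assembles the command
    and does not run it (mounting normally needs root).
    """
    if not share.startswith("//"):
        raise ValueError("share must look like //server/share[/subdir]")
    # Options are passed as one comma-separated string after -o.
    return ["mount", "-t", "cifs", share, mountpoint, "-o", ",".join(options)]


# Share and mount point taken from the comment above; "guest" is an assumption.
argv = cifs_mount_argv("//backup/Public/Downloads", "/media/Downloads")
print(shlex.join(argv))
# prints: mount -t cifs //backup/Public/Downloads /media/Downloads -o guest
```

Whether the resulting command works depends on the NAS and the kernel's cifs support, so treat the options as a starting point and not as a known-good configuration.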
  • (Score: 2) by canopic jug on Saturday June 30 2018, @06:07AM (2 children)

    by canopic jug (3949) Subscriber Badge on Saturday June 30 2018, @06:07AM (#700572) Journal

    IMHO, if a webpage doesn't have a date or Linux version prominently displayed then it needs to sink very quickly to the bottom of the innertubes.

    Sink, but not disappear. That is a hard problem for the search engines, especially with malicious actors like SEOs out there. Novelty is not the only context in which items on the net have value. There are many examples, especially when starting to get into historical research. However, a simple technical example is specifications and standards. Those don't change often yet are important. However, because standards can be many years old, they can and do get deleted even from institutions of record.

    --
    Money is not free speech. Elections should not be auctions.
    • (Score: 2) by frojack on Saturday June 30 2018, @07:18PM (1 child)

      by frojack (1554) on Saturday June 30 2018, @07:18PM (#700764) Journal

      There are many examples, especially when starting to get into historical research. ... However, because standards can be many years old, they can and do get deleted even from institutions of record.

      Doesn't matter.

      The same fate awaits all of them. Sooner or later someone abandons the content, and it's gone. The web is fundamentally the wrong instrument for knowledge preservation.

      Stuff disappears and nobody cares, and probably NOBODY SHOULD.

      Late on the afternoon of September 11th I started a program (I don't even remember what software it was anymore) that archived entire web sites. I pointed it at all the major news sites: CNN, Fox, MSNBC. For each, it amassed all linked pages, many levels deep (5 or 8 or something). I ran each for hours until I filled my server. I then burned them to archival quality DVDs.

      I read them every once in a while, to remind myself how unreliable early reporting can be, but also how quickly early truths can be buried by political correctness. Searching for those same pages online today (or even looking for the same text phrases) reveals just how fleeting anything stored on the net is. Even the WayBack Machine [archive.org] is hopelessly incomplete. My private archive probably dies with me.

      As a kid in the 6th grade, I had to do a presentation report on an incident in World War II. For some reason I chose the (pocket battleship) Admiral Graf Spee incident. I had encyclopedias, library books, and newspaper archives at my disposal. The local newspapers of the day were just like the web: some wrong and confusing stories, slowly being replaced by more complete and correct ones. But the point was that the OLD and the NEWer were all there together in chronological order in the newspaper archives (microfiche in some cases). You could work your way through the story, separate facts from fiction, unmuddle the story (even as a 6th grader). Facts and chronology became clearer.

      It's very difficult to do that on the web. Historical information disappears. Conspiracy theorists come charging to the fore.

      --
      No, you are mistaken. I've always had this sig.
      • (Score: 0) by Anonymous Coward on Saturday June 30 2018, @09:49PM

        by Anonymous Coward on Saturday June 30 2018, @09:49PM (#700787)

        "My private archive probably dies with me."

        It will have to stop being private. Maybe not fully public, if you find a place that accepts it under such terms, like some libraries do with writers' manuscripts. As for fully public, https://archive.org/ [archive.org] accepts uploads, it's not just "the wayback machine". https://archiveteam.org/ [archiveteam.org] is a different project, maybe it can give you some hints about what to do.
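One way to see how complete or incomplete the Wayback Machine is for a given page is to ask its availability endpoint for the snapshot closest to a date. The following is a rough sketch and not an official client: it assumes the JSON endpoint at https://archive.org/wayback/available with its url and timestamp parameters, and the function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is YYYYMMDD[hhmmss], e.g. '20010911'."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return API + "?" + urlencode(params)


def closest_snapshot(response):
    """Return (snapshot_url, timestamp) from a decoded reply, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"], closest["timestamp"]
    return None


def lookup(page_url, timestamp=None):
    """Query the endpoint over the network and return the closest snapshot."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

Calling lookup("cnn.com", "20010911") would return the nearest capture if one exists, or None otherwise. A None result only means that no capture was found for that address; the page may have existed without ever being crawled.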

  • (Score: 5, Touché) by NotSanguine on Saturday June 30 2018, @06:15AM (10 children)

    % man smbclient [die.net]

    Just a crazy thought.

    --
    No, no, you're not thinking; you're just being logical. --Niels Bohr
    • (Score: 1, Redundant) by Snotnose on Saturday June 30 2018, @06:27AM (9 children)

      by Snotnose (1623) on Saturday June 30 2018, @06:27AM (#700576)

      Yeah, that's how I found out the webpage was either dodgy or way out of date.

      I'm a technically savvy guy, spent way too many hours debugging a simple problem, and webpages that give bad advice need to sink to the bottom of the google rankings. Cuz google led me there.

      --
      When the dust settled America realized it was saved by a porn star.
      • (Score: 4, Insightful) by NotSanguine on Saturday June 30 2018, @07:16AM (8 children)

        Yeah, that's how I found out the webpage was either dodgy or way out of date.

        My point was that the man pages (already available on your system, or at least should be) should be the *first* place you go, not the last.

        There are definitely issues for which the man pages aren't so helpful, and some web pages are incredibly helpful.

        Too many people just google whatever it is they want and copypasta whatever the site tells them to do. I think that's the *wrong* way to do things.

        In that respect, I think that having the crappy pages (those with the most google ads on them?) near the top of search results may be better -- in that it may force people to actually think about what they're trying to accomplish and search more intelligently (no offense meant). I can't speak to the issue that you had, but if I wanted to do something with samba, samba.org would be my first choice -- after the man pages of course.

        That said, there's plenty of software whose online documentation is often worse than useless, even with common issues, and some enthusiast's blog does have the goods. VMWare is an excellent example of this.

        --
        No, no, you're not thinking; you're just being logical. --Niels Bohr
        • (Score: 0) by Anonymous Coward on Saturday June 30 2018, @08:49AM

          by Anonymous Coward on Saturday June 30 2018, @08:49AM (#700593)

          Even if a website is outdated, or not completely correct in the example it gives, it should give a hint to how to get to the answer you need.

        • (Score: 2, Flamebait) by crafoo on Saturday June 30 2018, @09:30AM (2 children)

          by crafoo (6639) on Saturday June 30 2018, @09:30AM (#700597)

          Modern Linux has trained users that man pages are shit. Because modern Linux man pages are largely piles of shit. Don't be surprised when users go to them as a last resort.

          • (Score: 1, Troll) by Arik on Saturday June 30 2018, @10:33AM (1 child)

            by Arik (4543) on Saturday June 30 2018, @10:33AM (#700610) Journal
            If the command is internal to linux you're probably going to want to look at info instead of man for current docs.
            --
            If laughter is the best medicine, who are the best doctors?
            • (Score: 2) by KiloByte on Saturday June 30 2018, @04:59PM

              by KiloByte (375) on Saturday June 30 2018, @04:59PM (#700716)

              Info files stopped being updated around 2000 or so, they're used by nothing but some GNU tools. And usually come with a non-free license so you don't even get them without jumping through extra hoops.

              --
              Ceterum censeo systemd esse delendam.
        • (Score: 2) by frojack on Saturday June 30 2018, @07:21PM

          by frojack (1554) on Saturday June 30 2018, @07:21PM (#700765) Journal

          He could have also simply used Google's date filter on his google search. He said he was "savvy". It doesn't rely on a date being printed on the page.

          --
          No, you are mistaken. I've always had this sig.
        • (Score: 2) by Snotnose on Sunday July 01 2018, @02:09AM (1 child)

          by Snotnose (1623) on Sunday July 01 2018, @02:09AM (#700829)

          My point was that the man pages (already available on your system, or at least should be) should be the *first* place you go, not the last.

          It's an embedded system with no man pages. If I was lucky I could "some_command --help", but that worked maybe 1/3 of the time. Google was how I figured out how to do stuff.

          --
          When the dust settled America realized it was saved by a porn star.
          • (Score: 2) by NotSanguine on Sunday July 01 2018, @08:33AM

            It's an embedded system with no man pages. If I was lucky I could "some_command --help", but that worked maybe 1/3 of the time. Google was how I figured out how to do stuff.

            A fair point. I could embark on a long diatribe about the shortcomings of UIs for embedded platforms, but you, apparently, are all too aware of such issues.

            Google, in my experience, doesn't do a very good job providing quality results to technical queries. I often find myself reformulating my searches to get what I'm looking for.

            Which brings us to the real issue with google and other search tools/information aggregators: they are not your friends and they do not exist to make your life easier. You (or more properly, your search habits and whatever other information they can glean from their interactions with you) are the product they sell to their actual customers. If they point you at poor quality information, it's because it benefits them in some way. Resolving your particular need/desire for specific information isn't even a consideration.

            And that touches on the larger point WRT TFA: we're losing stuff not because it isn't useful, but because it isn't making *someone* money.

            --
            No, no, you're not thinking; you're just being logical. --Niels Bohr
        • (Score: 0) by Anonymous Coward on Sunday July 01 2018, @02:19PM

          by Anonymous Coward on Sunday July 01 2018, @02:19PM (#700974)

          My point was that the man pages (already available on your system, or at least should be) should be the *first* place you go, not the last.

          In my experience that's more true for stuff like FreeBSD...

          Linux distros? Not so much.