SoylentNews is people

posted by Fnord666 on Sunday September 20 2020, @05:20PM
from the bitrot dept.

David Rosenthal discusses the last 25 years of digital preservation efforts with regard to academic journals. It's a long-standing problem, and discontinued journals continue to disappear from the Internet. Paper, microfilm, and microfiche are slow to degrade and are decentralized and distributed. Digital media are quick to disappear, and digital publications usually exist in only a single physical place, creating a single point of failure. It takes continuous, unbroken effort and money to keep digital publications accessible, even if only one person or institution wishes to retain access. He goes into the last few decades of academic publishing and how we got here, and then brings up four points about preservation, especially with regard to Open Access publishing.

Lesson 1: libraries won't pay enough to preserve even subscription content, let alone open-access content.

[...] Lesson 2: No-one, not even librarians, knows where most of the at-risk open-access journals are.

[...] Lesson 3: The production preservation pipeline must be completely automated.

[...] Lesson 4: Don't make the best be the enemy of the good. I.e. get as much as possible with the available funds, don't expect to get everything.

He posits that focus should be on the preservation of the individual articles, not the journals as units.

Previously:
(2020) Internet Archive Files Answer and Affirmative Defenses to Publisher Copyright Infringement Lawsuit
(2018) Vint Cerf: Internet is Losing its Memory
(2014) The Importance of Information Preservation

  • (Score: 2) by darkfeline on Sunday September 20 2020, @10:48PM (3 children)

    by darkfeline (1030) on Sunday September 20 2020, @10:48PM (#1054108)

    > It takes continuous, unbroken effort and money to keep digital publications accessible even if only one person or institution wishes to retain access.

    This sounds like it was paid for by the academic journal cabal. Make it legal to share and copy these, and let all the universities and researchers set up torrent seedboxes. I doubt all of the academic papers in the world together take up more space than a modern high-quality, full-length movie.

    --
    Join the SDF Public Access UNIX System today!
  • (Score: 3, Interesting) by HiThere on Monday September 21 2020, @01:31AM

    by HiThere (866) on Monday September 21 2020, @01:31AM (#1054172)

    Anything that relies on dynamically maintained documents is dubious for archival data. CDs are better than DVDs, because they are more robust against damage/deterioration.

    The problem is the difference between easy access and good archival quality, and the answer should be "use different media". Also ANYTHING that depends on encrypted keys being kept available is right out the window. It's totally useless for archival data. (Even if you can break the encryption, it makes it a lot more subject to errors causing the whole thing to be unreadable.)

    The problem is CDs aren't stable over long periods of time. They're good for multiple decades if handled carefully, but they are inherently unstable, so they probably won't hold up for a century even in ideal conditions. (This isn't inherently true, but it's true for the versions that could be written by a home computer.) Microfiche were a lot better in this regard, but reading them by computer was a real problem.

    The thing is, there hasn't been a lot of work done on producing archival quality media. There's little reward when you produce it, because most customers are more interested in ease of use, and lasting "long enough". Currently probably the best choice for large quantities of data is removable disk drives, but that's hardly archival quality. It lasts a decade or two if there aren't any unexpected problems. After that recovering the data is likely to be a major project, requiring opening the sealed drive, replacing the lubricants, and resealing it...at best.

    --
    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
  • (Score: 2) by canopic jug on Monday September 21 2020, @03:17AM (1 child)

    by canopic jug (3949) on Monday September 21 2020, @03:17AM (#1054209)

    I agree that it should be legal to share, copy, and redistribute articles indefinitely. Torrents (when not centralized) would be one easy, currently existing publication technology, but what mechanism do you propose to ensure the authenticity and general integrity of said documents? The situation we have now is that they are sourced from a single web site. While that ensures authenticity, it also introduces a single point of failure. If we encourage a distributed model, which we should and which is long overdue, then you have the problem of making sure that the article and its contents have not changed, either by accident or on purpose.
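
One common way to approach the integrity half of that question is to publish a cryptographic digest of each article and have every mirror or downloader recompute it. The sketch below is only an illustration under stated assumptions: the article bytes and the function names (`sha256_hex`, `verify_copy`) are hypothetical, and it assumes the reference digest reaches the reader through some channel they already trust. A digest alone does not establish who published the article; that would additionally need something like a publisher signature over the digest.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_copy(data: bytes, expected_hex: str) -> bool:
    """Check a downloaded copy against a digest obtained from a trusted source."""
    # compare_digest performs a constant-time comparison of the two strings
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


# Hypothetical article content, standing in for a real PDF or XML file.
original = b"Example article body, version of record.\n"

# Assumption: the publisher (or an archive) lists this digest somewhere trusted.
published_digest = sha256_hex(original)

# A copy altered by accident or on purpose no longer matches the digest.
tampered = original.replace(b"record", b"rec0rd")

print(verify_copy(original, published_digest))
print(verify_copy(tampered, published_digest))
```

BitTorrent already applies the same idea at the transport level by checking each downloaded piece against hashes in the torrent metadata, so the open question is mostly how readers come to trust the metadata or digest list in the first place.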

    --
    Money is not free speech. Elections should not be auctions.
    • (Score: -1, Troll) by Anonymous Coward on Monday September 21 2020, @05:12AM

      by Anonymous Coward on Monday September 21 2020, @05:12AM (#1054246)

      Blockchain! That's the answer to everything!