posted by Fnord666 on Sunday September 20 2020, @05:20PM
David Rosenthal discusses the last 25 years of digital preservation efforts with regard to academic journals. It's a long-standing problem, and discontinued journals continue to disappear from the Internet. Paper, microfilm, and microfiche are slow to degrade and are decentralized and distributed. Digital media are quick to disappear, and digital publications usually reside in only a single physical place, which creates a single point of failure. It takes continuous, unbroken effort and money to keep digital publications accessible, even if only one person or institution wishes to retain access. He goes into the last few decades of academic publishing and how we got here, and then brings up four lessons about preservation, especially with regard to Open Access publishing.

Lesson 1: libraries won't pay enough to preserve even subscription content, let alone open-access content.

[...] Lesson 2: No-one, not even librarians, knows where most of the at-risk open-access journals are.

[...] Lesson 3: The production preservation pipeline must be completely automated.

[...] Lesson 4: Don't make the best be the enemy of the good. I.e. get as much as possible with the available funds, don't expect to get everything.

He posits that focus should be on the preservation of the individual articles, not the journals as units.

  • (Score: 3, Interesting) by HiThere on Monday September 21 2020, @01:31AM

    by HiThere (866) on Monday September 21 2020, @01:31AM (#1054172)

    Anything that relies on dynamically maintained documents is dubious for archival data. CDs are better than DVDs, because they are more robust against damage/deterioration.

    The problem is the difference between easy access and good archival quality, and the answer should be "use different media". Also ANYTHING that depends on encryption keys being kept available is right out the window. It's totally useless for archival data. (Even if you can break the encryption, encryption makes the data a lot more subject to errors that leave the whole thing unreadable.)

    The problem is CDs aren't stable over long periods of time. They're good for multiple decades if handled carefully, but they are inherently unstable, so they probably won't hold up for a century even in ideal conditions. (This isn't inherently true, but it's true for the versions that could be written by a home computer.) Microfiche were a lot better in this regard, but reading them by computer was a real problem.

    The thing is, there hasn't been a lot of work done on producing archival quality media. There's little reward when you produce it, because most customers are more interested in ease of use, and lasting "long enough". Currently probably the best choice for large quantities of data is removable disk drives, but that's hardly archival quality. They last a decade or two if there aren't any unexpected problems. After that, recovering the data is likely to be a major project, at best requiring opening the sealed drive, replacing the lubricants, and resealing it.

    Javascript is what you use to allow unknown third parties to run software you have no idea about on your computer.
