SoylentNews is people

posted by martyb on Thursday September 12 2019, @07:22PM
from the it-depends dept.

Web developer Ukiah Smith wrote a blog post about which compression format to use when archiving. Obviously the algorithm must be lossless, but beyond that he sets out some criteria and then evaluates how some of the more common formats measure up.

After some brainstorming I have arrived with a set of criteria that I believe will help ensure my data is safe while using compression.

  • The compression tool must be opensource.
  • The compression format must be open.
  • The tool must be popular enough to be supported by the community.
  • Ideally there would be multiple implementations.
  • The format must be resilient to data loss.

Some formats I am looking at are zip, 7zip, rar, xz, bzip2, tar.

He closes by mentioning error correction. That has become more important than most acknowledge due to the large size of data files, the density of storage, and the propensity for bits to flip.
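
To make the bit-flip concern concrete, here is a minimal sketch (not taken from the blog post) of the difference between detecting damage and correcting it. It assumes a checksum stored alongside the archive for detection and three full copies with a bitwise majority vote for correction. The helper names flip_bit and majority_vote and the sample payload are made up for illustration; dedicated tools such as par2 use Reed-Solomon codes, which need far less redundancy than this.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulated bit rot)."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def majority_vote(a: bytes, b: bytes, c: bytes) -> bytes:
    """Bitwise majority of three equal-length copies (hypothetical helper)."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("copies must have equal length")
    # A bit is set in the result when it is set in at least two copies.
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


# Illustrative payload standing in for an archive file.
archive = b"example archive payload" * 100
digest = hashlib.sha256(archive).hexdigest()

# Detection: a single flipped bit changes the checksum.
damaged = flip_bit(archive, 1234)
detected = hashlib.sha256(damaged).hexdigest() != digest

# Correction: with two intact copies, the vote restores the original bytes.
repaired = majority_vote(damaged, archive, archive)
recovered = hashlib.sha256(repaired).hexdigest() == digest

print("damage detected:", detected)
print("archive recovered:", recovered)
```

A checksum alone only tells you the archive is damaged; recovering the data needs redundancy of some kind, which is the trade-off the error-correction remark points at.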


  • (Score: 3, Informative) by canopic jug (3949) on Friday September 13 2019, @05:44AM (#893543)

    The Xz format is inadequate for long-term archiving [nongnu.org]:

    There are several reasons why the xz compressed data format should not be used for long-term archiving, specially of valuable data. To begin with, xz is a complex container format that is not even fully documented. Using a complex format for long-term archiving would be a bad idea even if the format were well-designed, which xz is not. In general, the more complex the format, the less probable that it can be decoded in the future by a digital archaeologist. For long-term archiving, simple is robust.

    --
    Money is not free speech. Elections should not be auctions.