SoylentNews Comments | Which Compression Format to Use for Archiving?

Which Compression Format to Use for Archiving?

posted by martyb on Thursday September 12 2019, @07:22PM

from the it-depends dept.

Web developer Ukiah Smith wrote a blog post about which compression format to use when archiving. Obviously the algorithm must be lossless, but beyond that he sets out some criteria and then evaluates how some of the more common methods measure up.

After some brainstorming I have arrived at a set of criteria that I believe will help ensure my data is safe while using compression.
The compression tool must be open source.
The compression format must be open.
The tool must be popular enough to be supported by the community.
Ideally there would be multiple implementations.
The format must be resilient to data loss.
Some formats I am looking at are zip, 7zip, rar, xz, bzip2, tar.

He closes by mentioning error correction. That has become more important than most acknowledge due to the large size of data files, the density of storage, and the propensity for bits to flip.


Re:None (Score: 0) by Anonymous Coward on Friday September 13 2019, @08:34PM (#893842)

As an outside observer, I still don't get what your point is. Part of the problem, I think, is that you have an idea of what the PKZIP format specifies but don't actually know, or that you have insufficiently specified how your version differs from zip. Zip files have an individual header located at the start of each compressed file that contains all the information necessary to decompress, verify, and extract that particular file. In addition, there is the central directory trailer that contains all the information necessary to decompress, verify, and extract each and every file. In the event the trailer is trashed, you can still iterate through the file and decompress each entry; if a file header is trashed, you can use the directory to decompress it. Worst case scenario, if both the trailer and the previous file header are trashed, you just look for the next magic "PK\x03\x04", "PK\x05\x06", or "PK\x07\x08".
What, exactly, are you proposing that is different or better?
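
For anyone who wants to see the "look for the next magic" fallback in concrete terms, here is a rough Python sketch, not a real repair tool. The function names (find_signatures, recover_entry, recover_all) are made up for illustration. It assumes the standard 30-byte local file header layout, handles only stored and deflate-compressed entries, and skips entries whose sizes live in a trailing data descriptor (flag bit 3).

```python
import struct
import zlib

# The three magic numbers mentioned above.
SIGNATURES = {
    b"PK\x03\x04": "local file header",
    b"PK\x05\x06": "end of central directory",
    b"PK\x07\x08": "data descriptor",
}

# Fixed part of a local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length (30 bytes, little-endian).
LOCAL_FMT = "<4sHHHHHIIIHH"


def find_signatures(data):
    """Return a sorted list of (offset, kind) for every magic found."""
    hits = []
    for magic, kind in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = data.find(magic, pos + 1)
    return sorted(hits)


def recover_entry(data, offset):
    """Try to extract one member using only its local file header."""
    try:
        (_sig, _ver, flags, method, _time, _date, crc, csize, _usize,
         name_len, extra_len) = struct.unpack_from(LOCAL_FMT, data, offset)
    except struct.error:
        return None  # header runs past the end of the data
    if flags & 0x08:
        return None  # sizes are in a data descriptor; not handled here
    name_start = offset + struct.calcsize(LOCAL_FMT)
    body_start = name_start + name_len + extra_len
    raw_name = data[name_start:name_start + name_len]
    payload = data[body_start:body_start + csize]
    try:
        if method == 0:      # stored
            content = payload
        elif method == 8:    # deflate (raw stream, hence wbits=-15)
            content = zlib.decompress(payload, -15)
        else:
            return None      # other methods are out of scope for this sketch
    except zlib.error:
        return None
    if zlib.crc32(content) != crc:
        return None          # false positive or damaged entry
    encoding = "utf-8" if flags & 0x800 else "cp437"
    return raw_name.decode(encoding, errors="replace"), content


def recover_all(data):
    """Recover every member whose local header and payload still verify."""
    recovered = {}
    for offset, kind in find_signatures(data):
        if kind != "local file header":
            continue
        entry = recover_entry(data, offset)
        if entry is not None:
            name, content = entry
            recovered[name] = content
    return recovered
```

Calling recover_all on the raw bytes of an archive whose central directory has been cut off should still return the members whose local headers and payloads are intact. The CRC check is what rejects a magic number that merely happens to occur inside compressed data.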
