SoylentNews is people

posted by martyb on Thursday September 12 2019, @07:22PM
from the it-depends dept.

Web developer Ukiah Smith wrote a blog post about which compression format to use when archiving. The algorithm must obviously be lossless, but beyond that he sets out some criteria and then evaluates how several of the more common formats measure up.

After some brainstorming I have arrived with a set of criteria that I believe will help ensure my data is safe while using compression.

  • The compression tool must be opensource.
  • The compression format must be open.
  • The tool must be popular enough to be supported by the community.
  • Ideally there would be multiple implementations.
  • The format must be resilient to data loss.

Some formats I am looking at are zip, 7zip, rar, xz, bzip2, tar.

He closes by mentioning error correction. That has become more important than most acknowledge due to the large size of data files, the density of storage, and the propensity for bits to flip.
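As a rough illustration of why flipped bits matter for compressed archives, here is a minimal Python sketch. It is not taken from the blog post: the helper names (sha256_hex, flip_bit, pick_intact_copy) and the sample data are made up for this example. It only shows detection via a recorded checksum plus falling back to a second copy, which is not the same thing as true error correction.

```python
import hashlib
import lzma
from typing import Optional, Sequence


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (simulated bit rot)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def pick_intact_copy(copies: Sequence[bytes], expected_digest: str) -> Optional[bytes]:
    """Return the first copy whose digest matches, or None if every copy is damaged."""
    for candidate in copies:
        if sha256_hex(candidate) == expected_digest:
            return candidate
    return None


# Hypothetical payload; any bytes would do.
original = b"family photos and other irreplaceable data " * 100

archive = lzma.compress(original)   # xz-format compression from the standard library
digest = sha256_hex(archive)        # record this alongside the archive when it is created

# Simulate a single flipped bit in the middle of the first copy.
damaged = flip_bit(archive, 8 * (len(archive) // 2))

# The checksum exposes the damaged copy, so the second copy is used instead.
good = pick_intact_copy([damaged, archive], digest)
if good is not None:
    restored = lzma.decompress(good)
```

With only one copy and no recovery data, the checksum can report that the archive is damaged but cannot repair it, which is the gap that error correction is meant to fill.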


This discussion has been archived. No new comments can be posted.
  • (Score: 4, Informative) by JoeMerchant on Thursday September 12 2019, @07:54PM (3 children)

    by JoeMerchant (3937) on Thursday September 12 2019, @07:54PM (#893297)

    At present, I'm getting a .zip file from our Windoze dev ops server, everything else I shuffle around in .tar.gz format because... it's easy.

    If you want to be "safe," make multiple copies.

    If you want to be "safer," keep those multiple copies separated as far as practical from one another.

    If you're worried about the efficiency of the compression algorithm - either you're in a very special high data volume industry, or you haven't noticed what's happened to storage prices, storage device sizes and transfer speeds in the last decade (same could have been said 10 and 20 years ago.)

    As for widespread availability of the compression/decompression tools, I believe I was using the same .zip algorithms to distribute software on floppy disks back in the '90s, and .tar.gz is about as ubiquitous as it gets in the Linux world. Maybe there are others, but if anybody using a standard OS or reasonably feature rich distro's base configuration needs to install a piece of software to open your archive, I'd say you're doing it wrong.

    --
    🌻🌻 [google.com]
  • (Score: 2) by DannyB on Thursday September 12 2019, @09:07PM (1 child)

    by DannyB (5839) Subscriber Badge on Thursday September 12 2019, @09:07PM (#893345) Journal

    If you want to be "safer," keep those multiple copies separated as far as practical from one another.

    Opposite sides of the drive platter, outermost cylinder, is as far as they can be separated without leaving the confines of the drive enclosure. If the drive is an SSD, then cylinders, "far apart," and seek time become confusing.

    Off site backups have higher latency to access. But hopefully you never need to access them in an emergency.

    --
    People today are educated enough to repeat what they are taught but not to question what they are taught.
    • (Score: 2) by JoeMerchant on Thursday September 12 2019, @09:36PM

      by JoeMerchant (3937) on Thursday September 12 2019, @09:36PM (#893371)

      I'm satisfied with two external hard drives, connected to the same machine. Sure, lightning strike, fire, etc. could take them both down at once, but... so far I have been robust against 3 drive failures in the last 20 years, and I haven't had to screw around with multi-site issues.

      If my data had "real" value, I'd do the remote site rsync, at least out to the garage. "Priceless" family photos just don't merit that much effort, in my life.

      --
      🌻🌻 [google.com]
  • (Score: 3, Insightful) by acid andy on Friday September 13 2019, @09:41AM

    by acid andy (1683) on Friday September 13 2019, @09:41AM (#893570) Homepage Journal

    or you haven't noticed what's happened to storage prices, storage device sizes and transfer speeds in the last decade (same could have been said 10 and 20 years ago)

    I have noticed. The trouble is I've also noticed that the things I want to store on them seem to keep increasing in size at roughly the same rate!

    --
    If a cat has kittens, does a rat have rittens, a bat bittens and a mat mittens?