SoylentNews is people

posted by martyb on Thursday June 28 2018, @02:42AM
from the tight-squeeze dept.

Submitted via IRC for BoyceMagooglyMonkey

Compressing your files is a good way to save space on your hard drive. At Dropbox's scale, it's not just a good idea; it is essential. Even a 1% improvement in compression efficiency can make a huge difference. That's why we conduct research into lossless compression algorithms that are highly tuned for certain classes of files and storage, like Lepton for JPEG images, and Pied-Piper-esque lossless video encoding. For other file types, Dropbox currently uses the zlib compression format, which saves almost 8% of disk storage.

We introduce DivANS, our latest open-source contribution to compression, in this blog post. DivANS is a new way of structuring compression programs to make them more open to innovation in the wider community, by separating compression into multiple stages that can each be improved independently:

Source: https://blogs.dropbox.com/tech/2018/06/building-better-compression-together-with-divans/



This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Interesting) by shortscreen on Thursday June 28 2018, @08:14AM (3 children)

    by shortscreen (2252) on Thursday June 28 2018, @08:14AM (#699720)

    If they are trying to get good compression then why are they using speculative probability tables instead of counting the actual frequencies in the actual data? Doesn't zlib already do exactly that for each 32KB block?
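One way to make the question above concrete is to compare ideal code lengths under the two approaches. The sketch below is illustrative only and is not taken from DivANS or zlib: `static_cost_bits` and `adaptive_cost_bits` are hypothetical helper names, the adaptive model uses simple add-one counts, and the static figure ignores the cost of sending the frequency table to the decoder.

```python
import math
from collections import Counter


def static_cost_bits(data: bytes) -> float:
    """Ideal order-0 code length when exact byte counts are known up front.

    Assumption: the cost of transmitting the frequency table is NOT included.
    """
    counts = Counter(data)
    n = len(data)
    return -sum(c * math.log2(c / n) for c in counts.values())


def adaptive_cost_bits(data: bytes, order: int = 0) -> float:
    """Ideal code length for an adaptive model that learns while coding.

    Each context (the previous `order` bytes) starts with a count of 1 for
    every byte value; counts are updated only after a symbol is coded, so a
    decoder could rebuild the same tables without receiving them.
    """
    models = {}  # context bytes -> list of 256 counts
    bits = 0.0
    for i, b in enumerate(data):
        ctx = data[max(0, i - order):i]
        counts = models.setdefault(ctx, [1] * 256)
        bits -= math.log2(counts[b] / sum(counts))
        counts[b] += 1
    return bits


if __name__ == "__main__":
    sample = b"ab" * 5000  # synthetic, strongly patterned input
    print("static order-0  :", round(static_cost_bits(sample)), "bits")
    print("adaptive order-0:", round(adaptive_cost_bits(sample, 0)), "bits")
    print("adaptive order-1:", round(adaptive_cost_bits(sample, 1)), "bits")
```

On strongly patterned input such as `b"ab" * 5000`, the order-1 adaptive model needs far fewer bits than the counted order-0 table, because it can condition on the previous byte without transmitting any table. On short inputs the learning cost of the adaptive model can outweigh that gain.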

    • (Score: 1, Informative) by Anonymous Coward on Thursday June 28 2018, @12:03PM

      by Anonymous Coward on Thursday June 28 2018, @12:03PM (#699773)

      I don't know if this is the real reason, but Dropbox received a patent [freshpatents.com] for image recompression "with an arithmetic coding that uses a sophisticated adaptive probability model."

    • (Score: 0) by Anonymous Coward on Thursday June 28 2018, @10:26PM

      by Anonymous Coward on Thursday June 28 2018, @10:26PM (#700008)

      This isn't about optimisation, it's about getting you to install another vector for the NSA.

    • (Score: 0) by Anonymous Coward on Friday June 29 2018, @01:17AM

      by Anonymous Coward on Friday June 29 2018, @01:17AM (#700053)

      Yet their compression probably beats zlib by 8-10%.

      zlib is actually one of the worse deflate compressors out there. It is usually used as the '0' baseline for most compression tests. 7zip is one of the better ones for speed and compression, but it is still pretty slow. There are faster algorithms out there that compromise on space but still beat zlib for size; those are usually used for streaming. There are also ones that beat 7zip by a good 20%, but they are amazingly slow.

      DivANS is an interesting way to look at the streams spit out. Basically they are turning it into an IR, a directed-acyclic-graph-like language, then optimizing that much like an optimizing compiler would. I would say this is actually an important advancement to keep an eye on in the world of compression.
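For anyone who wants to check the zlib-versus-others comparison above on their own files, here is a minimal sketch using only Python's standard library. Assumptions: `compressed_sizes` is a hypothetical helper name, the sample input is synthetic, and Python's `lzma` module stands in for 7zip only in the sense that both use the LZMA algorithm (it writes `.xz` streams, not `.7z` archives). Results depend heavily on the data, so treat any numbers as indicative at best.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for three stdlib codecs."""
    return {
        "zlib": len(zlib.compress(data, 9)),  # deflate, maximum level
        "bz2": len(bz2.compress(data, 9)),    # bzip2, maximum level
        "lzma": len(lzma.compress(data)),     # LZMA in an .xz container, default preset
    }


if __name__ == "__main__":
    # Synthetic sample; replace with open(path, "rb").read() for a real file.
    sample = b"the quick brown fox jumps over the lazy dog. " * 2000
    print("original:", len(sample), "bytes")
    for name, size in compressed_sizes(sample).items():
        print(f"{name:>5}: {size} bytes ({100 * size / len(sample):.2f}% of original)")
```

This measures size only; the speed trade-offs mentioned above would need separate timing, for example with `time.perf_counter()` around each call.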
