
Why are these (lossless) compression methods of many similar png images ineffective?


Answered By : Raphael

Have a look at how compression algorithms work. At least those in the Lempel-Ziv family (gzip uses LZ77, zip apparently mostly does as well, and xz uses LZMA) compress somewhat locally: similarities that lie far away from each other cannot be identified.

The details differ between the methods, but the bottom line is that by the time the algorithm reaches the second image, it has already "forgotten" the beginning of the first. And so on.
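A quick way to see this locality effect, assuming image.png is a PNG larger than gzip's fixed 32 KiB LZ77 window (the filename is just a placeholder):

    gzip -9 < image.png | wc -c                 # compressed size of one copy
    cat image.png image.png | gzip -9 | wc -c   # roughly twice as much, because the
                                                # second copy lies outside the window

If the file were smaller than the window, the second command would come out only slightly larger than the first.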

You can try to change the parameters of the compression method manually; if the window size (LZ77) or the block/chunk size (later methods) is at least as large as two images, you will probably see further compression.
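gzip's window is hard-wired to 32 KiB, so it cannot be enlarged, but xz lets you set the LZMA2 dictionary size explicitly. A minimal sketch along those lines (the 192 MiB figure is only an assumption; it just needs to exceed the combined size of the images):

    tar cf folder.tar folder/
    xz -k --lzma2=preset=9,dict=192MiB folder.tar   # writes folder.tar.xz, keeps folder.tar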


Note that the above only really applies if you have identical, or almost identical, uncompressed images. If there are differences, the compressed images may not look anything alike in memory. I don't know the details of how PNG compression works; you may want to check the hex representations of the images you have for shared substrings manually.
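A rough manual check could look like this, assuming two of your files are named one.png and two.png (hypothetical names):

    cmp -l one.png two.png | wc -l   # number of byte positions at which the files differ
    xxd one.png | head               # inspect the raw bytes of each file
    xxd two.png | head

If cmp reports differences starting only a few bytes in, the compressed streams have already diverged and an LZ-based archiver will find little to reuse.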

Also note that even with changed parameters and redundancy to exploit, you won't get down to the size of a single image. Larger dictionaries mean larger code words, and even if two images are exactly identical, you may have to encode the second one using multiple code words (which point into the first).

Problem Detail: 

I just came across the following: I put multiple identical copies of a png image into a folder and then tried to compress that folder with the following methods:

  • tar czf folder.tar.gz folder/
  • tar cf folder.tar folder/ && xz --stdout folder.tar > folder.tar.xz (this one works well for identical images, however for similar images the gain is zero)
  • zip -r folder.zip folder/

When I checked the sizes of the .tar.gz, .tar.xz and .zip files, I realized that each is almost the same as that of folder/.
I understand that a png image itself may have a high level of compression and therefore cannot be compressed much further. However, when merging many similar (in this case even identical) png images into an archive and then compressing the archive, I would expect the required size to decrease markedly. In the case of identical images I would expect a size of roughly that of a single image.

Asked By : a_guest
Best Answer from Stack Exchange

Question Source : http://cs.stackexchange.com/questions/60588
