Compression program comparison

Out of curiosity, I tested some compression programs (bzip2, gzip, lzma, rzip, zip) for speed and compression ratio, using the Linux kernel source as input. The resulting data was turned into some interesting graphs with gnuplot.

All of the programs support compression levels 1-9, and each program was tested at every level.

Gzip graph

Graph: gzip compression

The default compression level (6) does indeed seem to give the best trade-off between time spent and resulting size.

Bzip2 graph

Graph: bzip2 compression

The bzip2 graph for this input is quite peculiar, in particular the compression levels 5-7.

Combined graph for all programs

This graph shows the time spent and resulting size, for each program. Note that the x-axis is logarithmic (lzma took a lot of time for the highest compression levels).

The interesting ones seem to be lzma and rzip (at lowest levels). Rzip may seem better than bzip2, but it uses considerably more memory than bzip2 (block size for bzip2 -9 is 900 KiB): “A typical compression run on a large file might use a couple of hundred MB of ram.” [1]

The file compressed was a tar of Linux kernel 2.6.0, 250316800 bytes (239 MiB). The benchmarks were run on a box with an “AMD Athlon(tm) 64 Processor 2800+” CPU and 2 GiB of memory. Each compression program and level was run six times, and the average of the last five runs was used. The script that was used to run the tests is available in the “scripts” project, “benchmark-compression”.
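For illustration, the "run six times, average the last five" procedure could be sketched roughly like this in Python. This is not the actual “benchmark-compression” script; the function names `time_runs` and `mean_after_warmup` are made up for this example, and the assumption that the first run is dropped as a warm-up (so that the input is presumably already cached) is mine.

```python
import statistics
import subprocess
import time


def time_runs(argv, runs=6):
    """Run the command `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard whatever the command writes to stdout; only elapsed time matters here.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def mean_after_warmup(timings, discard=1):
    """Average the timings, ignoring the first `discard` runs (assumed warm-up)."""
    return statistics.mean(timings[discard:])
```

Something like `mean_after_warmup(time_runs(["gzip", "-6", "-c", "linux-2.6.0.tar"]))` would then give one data point (the file name is only a placeholder), and looping the level from 1 to 9 would give one curve. Programs that do not write to stdout would need their own command line.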

Program versions:

  • bzip2 1.0.3
  • gzip 1.3.9
  • lzma 4.43
  • rzip 2.1
  • zip 2.32

Future ideas?

It would be nice to have some data on memory usage as well (any ideas on how to measure this?). Using various input file types would also be a good idea (binaries, images, movies etc).
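One possible way to get at the memory usage, as an untested idea rather than something used for the graphs above: on Unix-like systems, `os.wait4` hands back the resource usage of a finished child process, including its peak resident set size. A minimal sketch, assuming Python 3.9 or newer (for `os.waitstatus_to_exitcode`); the function name `peak_rss` is made up for this example.

```python
import os
import subprocess


def peak_rss(argv):
    """Run the command and return the peak resident set size the OS reports for it.

    The unit of ru_maxrss is platform dependent: kilobytes on Linux, bytes on macOS.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL)
    # wait4() reaps this specific child and returns its own resource usage.
    _, status, usage = os.wait4(proc.pid, 0)
    # Popen did not do the waiting itself, so record the exit code by hand.
    proc.returncode = os.waitstatus_to_exitcode(status)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)
    return usage.ru_maxrss
```

Calling `peak_rss(["bzip2", "-9", "-c", "linux-2.6.0.tar"])` (placeholder file name) would give one number per program and level. Very small readings should be treated with some caution, since the value may partly reflect the launching process rather than the compressor itself.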

[1] http://rzip.samba.org/
