Coding Horror

programming and human factors

File Compression in the Multi-Core Era

I've been playing around a bit with file compression again, as we generate some very large backup files daily on Stack Overflow.

We're using the latest 64-bit version of 7zip (4.64) on our database server. I'm not a big fan of more than dual core on the desktop, but it's a no brainer for servers. The more CPU cores the merrier! This server has two quad-core CPUs, a total of 8 cores, and I was a little disheartened to discover that neither RAR nor 7zip seemed to make much use of more than 2.

Still, even if it does only use 2 cores to compress, the 7zip algorithm is amazingly effective, and has evolved over the last few years to be respectably fast. I used to recommend RAR over Zip, but given the increased efficiency of 7zip and the fact that it's free and RAR isn't, it's the logical choice now.

Here are some quick tests I performed compressing a single 4.76 GB database backup file. This was run on a server with dual quad-core 2.5 GHz Xeon E5420 CPUs.

7zipfastest5 min14 MB/sec973 MB
7zipfast7 min11 MB/sec926 MB
7zipnormal34 min2.5 MB/sec752 MB
7zipmaximum41 min2.0 MB/sec714 MB
7zipultra48 min1.7 MB/sec698 MB

For those of you who are now wondering, wow, if 7zip does this well on maximum and ultra, imagine how it'd do on ultra-plus, don't count on it. There's a reason most compression programs default to certain settings as "normal". Above these settings, results tend to fall off a cliff; beyond that sweet spot, you tend to get absurdly tiny increases in compression ratio in exchange for huge order of magnitude increases in compression time.

Now watch what happens when I switch 7zip to use the bzip2 compression algorithm:

7zip with bzip2 selected

We'll compress that same 4.76 GB file, on the same machine:

bzip2fastest2 min36 MB/sec1092 MB
bzip2fast2.5 min29 MB/sec1011 MB
bzip2normal3.5 min22 MB/sec989 MB
bzip2maximum7 min12 MB/sec987 MB
bzip2ultra21 min4 MB/sec986 MB

Why is bzip2 able to work so much faster than 7zip? Simple:

7zip algorithm CPU usage

7zip multithreaded cpu usage

bzip2 algorithm CPU usage

bzip2-multithreaded-cpu-usage.png

Bzip2 uses more than 2 CPU cores to parallelize its work. I'm not sure what the limit is, but the drop-down selector in the 7zip GUI allows up to 16 when the bzip2 algorithm is chosen. I used 8 for the above tests, since that's how many CPU cores we have on the server.

Unfortunately, bzip2's increased speed is sort of moot at high compression levels. The difference between normal, maximum, and ultra compression is a meaningless 0.06 percent. It scales beautifully in time, but hardly at all in space. That's a shame, because that's exactly where you'd like to spend the speed increase of paralellization. Eking out a percent of size improvement could still make sense, depending on the circumstances:

total time = compression time + n * (compressed file size / network speed + decompression time)

For instance, if you compress a file to send it over a network once, n equals one and compression time will have a big influence. If you want to post a file to be downloaded many times, n is big so long compression times will weigh less in the final decision. Finally, slow networks will do best with a slow but efficient algorithm, while for fast networks a speedy, possibly less efficient algorithm is needed.

On the other hand, the ability to compress a 5 GB source file to a fifth of its size in two minutes flat is pretty darn impressive. Still, I can't help wondering how fast the 7zip algorithm would be if it was rewritten and parallelized to take advantage of more than 2 CPU cores, too.

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Exchange and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: http://twitter.com/codinghorror