Coding Horror

programming and human factors

Net.WebClient and Deflate

In a previous entry, Net.WebClient and Gzip, I posted a code snippet that enables the missing HTTP compression in Net.WebClient, using the always handy SharpZipLib.

This code eventually made it into one of my CodeProject articles. An eagle-eyed CodeProject reader noted that, while my code worked for gzip compression, it failed miserably for websites that use deflate compression. This is a case of be careful what you ask for:

    Dim wc As New Net.WebClient
    '-- google will not gzip the content if the User-Agent header is missing!
    wc.Headers.Add("User-Agent", strHttpUserAgent)
    wc.Headers.Add("Accept-Encoding", "gzip,deflate")
    '-- download the target URL into a byte array
    Dim b() As Byte = wc.DownloadData(strUrl)

99% of the time, you'll get a gzipped array of bytes back from that request. For whatever reason, deflate compression is extremely rare on the open internet. The same reader also helpfully provided a URL that uses deflate: Redline Networks. So that was my test case. Although SharpZipLib supports deflate compression, I had difficulty getting it to work using the provided inflater stream class. And since it's such a rare case, I couldn't find any working code samples.
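Because the request above accepts both encodings, the response's Content-Encoding header is what tells you which one actually came back. Here is a minimal sketch of mapping that header to an enum. The HttpContentEncoding definition below is an assumption (only its Gzip and Deflate members appear in this post), and ParseContentEncoding is a hypothetical helper name.

```vbnet
Imports System

'-- assumed definition; only Gzip and Deflate are confirmed by the post
Public Enum HttpContentEncoding
    None
    Gzip
    Deflate
    Compress
    Unknown
End Enum

Public Module ContentEncodingSketch
    ''' <summary>
    ''' maps a raw Content-Encoding header value to the enum (hypothetical helper)
    ''' </summary>
    Public Function ParseContentEncoding(ByVal headerValue As String) As HttpContentEncoding
        '-- a missing header means the body was sent uncompressed
        If String.IsNullOrEmpty(headerValue) Then Return HttpContentEncoding.None
        Select Case headerValue.Trim().ToLowerInvariant()
            Case "identity"
                Return HttpContentEncoding.None
            Case "gzip", "x-gzip"
                Return HttpContentEncoding.Gzip
            Case "deflate"
                Return HttpContentEncoding.Deflate
            Case "compress", "x-compress"
                Return HttpContentEncoding.Compress
            Case Else
                Return HttpContentEncoding.Unknown
        End Select
    End Function
End Module
```

After wc.DownloadData returns, something like ParseContentEncoding(wc.ResponseHeaders("Content-Encoding")) should give you the type to decompress with. The indexer returns Nothing when the header is absent, which the sketch treats as uncompressed.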

In desperation-- my OCD prohibits me from letting that last 1% case go-- I turned to the only relevant Google result I could find, which happens to be on the SharpZipLib community forum. Jfreilly provided an answer within a day! Problem solved. He also maintains a very nice SharpZip Library FAQ. Kudos to you, sir.

    '-- assumes Imports System.IO and Imports ICSharpCode.SharpZipLib
    ''' <summary>
    ''' decompresses a compressed array of bytes
    ''' via the specified HTTP compression type
    ''' </summary>
    Private Function Decompress(ByVal b() As Byte, _
        ByVal CompressionType As HttpContentEncoding) As Byte()
        Dim s As Stream
        Select Case CompressionType
            Case HttpContentEncoding.Deflate
                '-- True = no zlib header expected; treat the data as raw deflate
                s = New Zip.Compression.Streams.InflaterInputStream( _
                    New MemoryStream(b), _
                    New Zip.Compression.Inflater(True))
            Case HttpContentEncoding.Gzip
                s = New GZip.GZipInputStream(New MemoryStream(b))
            Case Else
                '-- not a compression type we handle; return the bytes as-is
                Return b
        End Select
        Dim ms As New MemoryStream
        Const intChunkSize As Integer = 2048
        Dim intSizeRead As Integer
        '-- VB array bounds are inclusive, so this is exactly intChunkSize bytes
        Dim unzipBytes(intChunkSize - 1) As Byte
        '-- copy decompressed chunks until the stream is exhausted
        While True
            intSizeRead = s.Read(unzipBytes, 0, intChunkSize)
            If intSizeRead > 0 Then
                ms.Write(unzipBytes, 0, intSizeRead)
            Else
                Exit While
            End If
        End While
        s.Close()
        Return ms.ToArray()
    End Function
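The True passed to the Inflater constructor above tells SharpZipLib not to expect a zlib (RFC 1950) header, which matches servers that send raw deflate data. Some servers do wrap their deflate output in a zlib header, though. The sketch below is a heuristic check of the first two bytes; LooksLikeZlibHeader is a hypothetical helper, not part of SharpZipLib, and raw deflate data could occasionally pass the test by coincidence.

```vbnet
Imports System

Public Module DeflateSketch
    ''' <summary>
    ''' True when the first two bytes look like an RFC 1950 zlib header
    ''' (hypothetical helper; a heuristic, not a guarantee)
    ''' </summary>
    Public Function LooksLikeZlibHeader(ByVal b() As Byte) As Boolean
        If b Is Nothing OrElse b.Length < 2 Then Return False
        Dim cmf As Integer = b(0)
        Dim flg As Integer = b(1)
        '-- low nibble of CMF is the compression method; 8 means deflate
        If (cmf And &HF) <> 8 Then Return False
        '-- high nibble encodes the window size and must not exceed 7
        If (cmf >> 4) > 7 Then Return False
        '-- CMF * 256 + FLG must be a multiple of 31
        Return ((cmf * 256 + flg) Mod 31) = 0
    End Function
End Module
```

One way to use it, assuming the Decompress function above, would be to pass Not LooksLikeZlibHeader(b) to the Inflater constructor instead of a hard-coded True.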

There is also a mysterious third kind of HTTP compression: compress. OK, it's not all that mysterious, but nobody seems to use it. What's up with that?

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Exchange and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: http://twitter.com/codinghorror