Blocking Image Bandwidth Theft with URL Rewriting

I like to periodically watch the HTTP traffic on my server. I can see what I'm actually serving up over the wire, and how much bandwidth I'm using.

That's how I noticed that I've become somewhat popular with direct-link image bandwidth thieves. In other words, people who thoughtlessly (or maliciously) embed these IMG links in their web page:

<img src="http://www.codinghorror.com/blog/images/qbert_regex_16.png">

That means the image qbert_regex_16.png is served by my webserver to every user who happens to request this myspace profile page.

Warning: like all myspace pages, that page is

  • Not really safe for work
  • Incredibly, mind-bendingly ugly
  • Filled with thousands of images, animated images, flash, MIDI samples, embedded MP3s
  • Utterly and completely incomprehensible

In short, a trainwreck. Every time I visit myspace, I feel a little bit stupider, à la Billy Madison:

Principal: Mr. Madison, what you've just said is one of the most insanely idiotic things I have ever heard. At no point in your rambling, incoherent response were you even close to anything that could be considered a rational thought. Everyone in this room is now dumber for having listened to it. I award you no points, and may God have mercy on your soul.

Billy Madison: Okay, a simple no would've done just fine.

I have no idea why myspace is so popular. I guess the best I can hope for is that those damn kids stay off my lawn.

Anyway, back to business. The most common technique for blocking direct image links is to check the HTTP referer header. Here's the complete HTTP header set of an image request that just came through:

GET /blog/images/logitech_g15_keyboard.jpg HTTP/1.1
Accept: */*
Referer: http://www2.gamelux.nl/forum/topics/10072/38/
Accept-Language: nl
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: www.codinghorror.com
Connection: Keep-Alive
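To make the Referer value concrete, here's a minimal Python sketch that pulls the referring host out of a header block like the one above. The helper names `parse_headers` and `referring_host` are made up for illustration, and the parsing is deliberately naive; a real server would hand you these headers already parsed.

```python
from urllib.parse import urlsplit

# The request shown above, as it would arrive on the wire.
RAW_REQUEST = (
    "GET /blog/images/logitech_g15_keyboard.jpg HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Referer: http://www2.gamelux.nl/forum/topics/10072/38/\r\n"
    "Accept-Language: nl\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)\r\n"
    "Host: www.codinghorror.com\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)


def parse_headers(raw_request):
    """Return the request headers as a dict with lower-cased names."""
    headers = {}
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:                            # blank line ends the headers
            break
        name, _, value = line.partition(":")    # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return headers


def referring_host(headers):
    """Return the host named in the Referer header, or "" if there is none."""
    referer = headers.get("referer", "")
    return (urlsplit(referer).hostname or "") if referer else ""


headers = parse_headers(RAW_REQUEST)
print(referring_host(headers))  # www2.gamelux.nl
print(headers["host"])          # www.codinghorror.com
```

The referring host and the Host header disagree, which is exactly the signature of a direct-linked image.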

Prior to serving up the image, we should check the Referer HTTP header, and make sure it's either:

  1. Blank
  2. In a list of known whitelisted referring domains

If it isn't, we will serve up either a 404 error, or a "hey, stop stealing our bandwidth" image of some kind. Because I'm a nice guy, I chose this image:

All this can be done through incredibly powerful URL Rewriting, which has been standard on Apache for some time. There's a nice walkthrough on how to set up image link blocking in Apache on Tom Sherman's site.

Unfortunately, IIS 6 doesn't have native support for URL Rewriting*, but there are any number of third-party ISAPI filters that can do it. The one I use is ISAPI Rewrite. It's very similar to the Apache version, in that it is driven by an httpd.ini file in the root of each website. I struggled a bit with the rules, but thanks to a helpful forum post, I realized that I needed to put all the whitelisted domains on a single line, inside one negative lookahead, to get a boolean "or" that also covers the empty referer case, like so:

[ISAPI_Rewrite]

# Block external image linking
RewriteCond Referer: (?!http://(?:www\.codinghorror\.com|www\.bloglines\.com|www\.google\.com)).+
RewriteRule .*\.(?:gif|jpg|png) /images/block.gif [I,O]

So, as outlined above: unless the referer is blank, or in the whitelist, they get shunted to the blocked image.**
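If you want to convince yourself that the rule behaves this way, here's a rough Python sketch that replays the same two patterns with the `re` module. It is not how ISAPI Rewrite works internally. It assumes both patterns have to match the whole string (hence `fullmatch`), and the `rewrite` function and its constants are names I made up for the example.

```python
import re

# The two patterns from the httpd.ini rule, with the dots escaped.
REFERER_BLOCKED = re.compile(
    r"(?!http://(?:www\.codinghorror\.com|www\.bloglines\.com|www\.google\.com)).+"
)
IMAGE_URL = re.compile(r".*\.(?:gif|jpg|png)", re.IGNORECASE)  # [I] = ignore case

BLOCK_IMAGE = "/images/block.gif"  # hypothetical constant for the sketch


def rewrite(path, referer):
    """Return the path that would be served for this request."""
    # Assumption: each pattern must match the entire string.
    is_image = IMAGE_URL.fullmatch(path) is not None
    # ".+" needs at least one character, so a blank referer never matches;
    # the negative lookahead rejects any referer starting with a listed site.
    is_blocked = REFERER_BLOCKED.fullmatch(referer or "") is not None
    return BLOCK_IMAGE if (is_image and is_blocked) else path


image = "/blog/images/qbert_regex_16.png"
print(rewrite(image, ""))                                    # image, untouched
print(rewrite(image, "http://www.codinghorror.com/blog/"))   # image, untouched
print(rewrite(image, "http://www2.gamelux.nl/forum/topics/10072/38/"))  # block image
print(rewrite("/blog/index.html", "http://www2.gamelux.nl/"))           # page, untouched
```

Two caveats fall out of the pattern itself. The lookahead is a prefix test, so a referer such as http://www.google.com.example.net/ would also be let through. And the Referer header is supplied by the client, so this stops casual direct linking, not a determined one.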

Take that, 26 zillion myspace users.

* I'm pretty sure URL Rewriting will be in IIS 7, since they're finally getting around to making a really good copy of Apache's modular architecture.

** This is done at the ISAPI level, so unlike the cheesy ASP.NET "URL rewriting" solutions, it also works on generic URLs, not just URLs that end in .aspx or some other extension that is sent to the ASP.NET handler. This has long been a pet peeve of mine, but it's really the fault of IIS. And it's changing in IIS 7.
