Blocking Image Bandwidth Theft with URL Rewriting

I like to periodically watch the HTTP traffic on my server. I can see what I'm actually serving up over the wire, and how much bandwidth I'm using.

That's how I noticed that I've become somewhat popular with direct-link image bandwidth thieves. In other words, people who thoughtlessly (or maliciously) embed IMG links like this one in their web pages:

<img src="http://www.codinghorror.com/blog/images/qbert_regex_16.png">

That means the image qbert_regex_16.png is served by my webserver to every user who happens to request this myspace profile page.

Warning: like all myspace pages, that page is

  • Not really safe for work
  • Incredibly, mind-bendingly ugly
  • Filled with thousands of images, animated images, flash, MIDI samples, embedded MP3s
  • Utterly and completely incomprehensible

In short, a trainwreck. Every time I visit myspace, I feel a little bit stupider, à la Billy Madison:

Principal: Mr. Madison, what you've just said is one of the most insanely idiotic things I have ever heard. At no point in your rambling, incoherent response were you even close to anything that could be considered a rational thought. Everyone in this room is now dumber for having listened to it. I award you no points, and may God have mercy on your soul.

Billy Madison: Okay, a simple no would've done just fine.

I have no idea why myspace is so popular. I guess the best I can hope for is that those damn kids stay off my lawn.

Anyway, back to business. The most common technique for blocking direct image links is to check the HTTP referer header. Here's the complete HTTP header set of an image request that just came through:

GET /blog/images/logitech_g15_keyboard.jpg HTTP/1.1
Accept: */*
Referer: http://www2.gamelux.nl/forum/topics/10072/38/
Accept-Language: nl
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
Host: www.codinghorror.com
Connection: Keep-Alive

Prior to serving up the image, we should check the Referer HTTP header, and make sure it's either:

  1. Blank
  2. In a list of known whitelisted referring domains

If it isn't, we will serve up either a 404 error, or a "hey, stop stealing our bandwidth" image of some kind. Because I'm a nice guy, I chose this image:

All this can be done through incredibly powerful URL Rewriting, which has been standard on Apache for some time. There's a nice walkthrough on how to set up image link blocking in Apache on Tom Sherman's site.
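As a rough illustration, the check described above can be sketched in Python. This is only a sketch under assumptions, not what the server actually runs: the function names, the ALLOWED_HOSTS set, and the BLOCK_IMAGE path are placeholders chosen for illustration (the hosts and the path are borrowed from the rewrite rule shown later in this post).

```python
from urllib.parse import urlsplit

# Assumed whitelist for illustration; swap in your own referring hosts.
ALLOWED_HOSTS = {"www.codinghorror.com", "www.bloglines.com", "www.google.com"}

# Assumed location of the "stop stealing our bandwidth" image.
BLOCK_IMAGE = "/images/block.gif"


def is_allowed_referer(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when the Referer is blank or comes from a whitelisted host."""
    # Case 1: blank or missing Referer.
    if referer is None or not referer.strip():
        return True
    # Case 2: the referring host is in the whitelist.
    # .hostname is lowercased, or None if there is no host part.
    host = urlsplit(referer.strip()).hostname
    return host in allowed_hosts


def image_to_serve(requested_path, referer):
    """Return the path to serve: the requested image, or the block image."""
    return requested_path if is_allowed_referer(referer) else BLOCK_IMAGE


if __name__ == "__main__":
    image = "/blog/images/logitech_g15_keyboard.jpg"
    # The forum request from the header dump above gets the block image.
    print(image_to_serve(image, "http://www2.gamelux.nl/forum/topics/10072/38/"))
    # A blank referer gets the real image.
    print(image_to_serve(image, ""))
```

Comparing parsed host names rather than raw strings is a design choice of this sketch; the rewrite rule later in the post works on the raw header text instead.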

Unfortunately, IIS 6 doesn't have native support for URL Rewriting*, but there are any number of third-party ISAPI filters that can do it. The one I use is ISAPI Rewrite. It's very similar to the Apache version, in that it is driven by the httpd.ini file in the root of each website. I struggled a bit with the rules, but thanks to a helpful forum post, I realized that I needed to put all the whitelisted domains on a single line to get a boolean "or" that included the empty referer case, like so:

[ISAPI_Rewrite]

# Block external image linking
RewriteCond Referer: (?!http://(?:www\.codinghorror\.com|www\.bloglines\.com|www\.google\.com)).+
RewriteRule .*\.(?:gif|jpg|png) /images/block.gif [I,O]

So, as outlined above: unless the referer is blank, or in the whitelist, they get shunted to the blocked image.**
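To see why the single-line form covers the blank case, it can help to run the condition's pattern outside of IIS. The sketch below is Python, not ISAPI Rewrite, and it rests on an assumption: that the pattern has to match the entire Referer value, which is approximated here with re.fullmatch. The dots are escaped in this version, the [I,O] flags are not modeled, and is_blocked is a made-up helper name.

```python
import re

# The RewriteCond pattern, with literal dots escaped.
# (?!...) is a negative lookahead: it fails if the value starts with a whitelisted site.
# .+ then requires at least one character, so an empty referer can never match.
REFERER_BLOCK = re.compile(
    r"(?!http://(?:www\.codinghorror\.com|www\.bloglines\.com|www\.google\.com)).+"
)


def is_blocked(referer):
    """Return True when the condition matches, i.e. the block image would be served."""
    # Assumption: the whole header value must match, hence fullmatch.
    return REFERER_BLOCK.fullmatch(referer) is not None


if __name__ == "__main__":
    samples = [
        "",                                               # blank referer
        "http://www.codinghorror.com/blog/",              # whitelisted
        "http://www2.gamelux.nl/forum/topics/10072/38/",  # external forum
    ]
    for referer in samples:
        print(repr(referer), "blocked" if is_blocked(referer) else "allowed")
```

The trailing .+ is what lets a blank referer through: an empty string can never satisfy it, so the condition fails and the rule is skipped. Note that the lookahead only checks a prefix, so under these assumptions a referer such as http://www.google.com.example.net/ would also be let through, and an https:// referer from a whitelisted site would be blocked.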

Take that, 26 zillion myspace users.

* I'm pretty sure URL Rewriting will be in IIS 7, since they're finally getting around to making a really good copy of Apache's modular architecture.

** This is done at the ISAPI level, so unlike the cheesy ASP.NET "URL rewriting" solutions, it also works on generic URLs, not just URLs that end in .aspx or some other extension that is sent to the ASP.NET handler. This has long been a pet peeve of mine, but it's really the fault of IIS. And it's changing in IIS 7.
