Coding Horror

programming and human factors

The Popularity Tax

I'm sure everyone reading this is familiar with the Slashdot effect:

When Slashdot links a site, often a lot of readers will hit the link to read the story or see the purty pictures. This can easily throw thousands of hits at the site in minutes. Most of the time, large professional websites have no problem with this, but often a site we link will be a smaller site, used to getting only a few thousand hits a day. When all those Slashdot readers start crashing the party, it can saturate the site completely, causing the site to buckle under the strain. When this happens, the site is said to be "Slashdotted." Recently, the terms "Slashdot Effect" and "Slashdotted" have been used more generally to refer to any short-term traffic jam at a website.
The Slashdot effect is an interesting illustration of what I call the popularity tax: the more popular the content is, the more it costs the content owner. This is unique to the world of bits (e.g., the internet). In the world of atoms (e.g., the real world), it's the exact opposite: mere popularity doesn't cost the content owner a dime. If you have something people want, they have to pay to physically get to it, one way or another. Compare a book or a movie to, say, that Bush/Kerry Flash movie. Who is footing the massive bandwidth bill for the tens of thousands of people downloading that file?

At the very root of this discussion is, of course, the cost of bandwidth. Computer hardware may get cheaper every day, and free open source software seems to be flourishing, but bandwidth shows no sign of following the same trends. There's an interesting Microsoft Research paper on this topic: Jim Gray's Distributed Computing Economics.

Computing economics are changing. Today there is rough price parity between (1) one database access, (2) ten bytes of network traffic, (3) 100,000 instructions, (4) 10 bytes of disk storage, and (5) a megabyte of disk bandwidth. This has implications for how one structures Internet-scale distributed computing: one puts computing as close to the data as possible in order to avoid expensive network traffic.
In other words, bandwidth is hellaciously expensive relative to almost anything else, other than perhaps labor costs (and thus software?).
From this we conclude that one dollar equates to:

$1 =
  1 GB sent over the WAN
  10 Tops (tera-CPU instructions)
  8 hours of CPU time
  1 GB of disk space
  10 M database accesses
  10 TB of disk bandwidth
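
To put that exchange rate in popularity-tax terms, here is a back-of-the-envelope sketch in Python. The only figures taken from the list above are the roughly $1 per GB of WAN traffic and the 10 TB of disk bandwidth per dollar; the file size and download count are made-up numbers for illustration, not measurements of any real site.

```python
# Rough exchange rates from the "$1 =" list above (Gray's figures).
WAN_GB_PER_DOLLAR = 1.0              # $1 buys about 1 GB sent over the WAN
DISK_BANDWIDTH_GB_PER_DOLLAR = 10_000.0  # $1 buys about 10 TB of disk bandwidth


def bandwidth_cost(file_size_mb: float, downloads: int) -> float:
    """Estimated dollars to serve `downloads` copies of a file over the WAN."""
    total_gb = file_size_mb * downloads / 1000  # decimal units: 1 GB = 1000 MB
    return total_gb / WAN_GB_PER_DOLLAR


# Hypothetical example: a 4 MB file downloaded 50,000 times.
cost = bandwidth_cost(file_size_mb=4, downloads=50_000)
print(f"Estimated bandwidth bill: ${cost:,.2f}")

# How much more a byte costs to ship over the WAN than to read off local disk.
wan_vs_disk = DISK_BANDWIDTH_GB_PER_DOLLAR / WAN_GB_PER_DOLLAR
print(f"WAN traffic is about {wan_vs_disk:,.0f}x the price of disk bandwidth")
```

The cost scales linearly with downloads, which is the whole problem: double the audience and you double the bill, with nothing in the formula that brings in revenue.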


So, we have this weird situation on the internet where popularity is a curse, instead of the blessing it should be. This is a terrible state of affairs, and a major disincentive to build anything that a lot of people will consume. Right now, unless you are leveraging your popularity to sell something, you are generating a substantial net loss through bandwidth costs. This has always been my argument for micropayments, which would let popularity translate into income, but there are a lot of heavyweight barriers to this ever happening, such as the current banking industry's inability to deal with any volume of tiny transactions.

Micropayments are one answer, but micropayments have problems of their own. I don't really have any good answers to the bandwidth cost issue. It is a serious, complex problem with no easy resolution. In a worst-case scenario, highly asymmetric upload/download ratios can be an indirect way for big corporations to enforce copyright restrictions on consumers. The typical cable modem can download data at 300kb/sec but can only upload at a measly 30kb/sec (or less!). Can't shut down Kazaa or Napster through legislation? The next best thing is to remove people's ability to upload, thus crippling P2P at its very source. This creates a digital ghetto: 99% of users are locked into the role of ravenous consumers, with massive download capacities but a mere trickle of upload.
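
A quick sketch of what that asymmetry means in practice, using the 300 and 30 figures above. The file size is a hypothetical number, and it is expressed in the same (unspecified) "kb" unit as the rates so the comparison holds whichever unit was meant.

```python
def transfer_seconds(size_kb: float, rate_kb_per_sec: float) -> float:
    """Time to move `size_kb` of data at a steady `rate_kb_per_sec`."""
    return size_kb / rate_kb_per_sec


DOWNLOAD_RATE = 300.0  # kb/sec, from the paragraph above
UPLOAD_RATE = 30.0     # kb/sec, from the paragraph above
FILE_SIZE = 4_000.0    # hypothetical file size, in the same kb units

down = transfer_seconds(FILE_SIZE, DOWNLOAD_RATE)
up = transfer_seconds(FILE_SIZE, UPLOAD_RATE)

print(f"download: {down:.0f} s, upload: {up:.0f} s, ratio: {up / down:.0f}x")
```

Whatever the file size, sharing it takes ten times as long as fetching it, so every peer can serve only a tenth of what it consumes.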

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Exchange and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: http://twitter.com/codinghorror