Coding Horror

programming and human factors

Profitable Until Deemed Illegal

I was fascinated to discover the auction hybrid site swoopo.com (previously known as telebid.com). It's a strange combination of eBay, woot, and slot machine. Here's how it works:

  • You purchase bids in pre-packaged blocks of at least 30. Each bid costs you 75 cents, with no volume discount.
  • Each bid raises the purchase price by 15 cents and increases the auction time by 15 seconds.
  • Once the auction ends, you pay the final price.

I just watched an 8GB Apple iPod Touch sell on swoopo for $187.65. The final price means a total of 1,251 bids were placed for this item, costing bidders a grand total of $938.25.

So that $229 item ultimately sold for $1,125.90.

But that one final bidder got a great deal, right? Maybe. Even when you win, you can lose. Remember that each bid costs you 75 cents, while only increasing the price of the item 15 cents. If you bid too many times on an item -- or if you use the site's "helpful" automated BidButler service, which bids on your behalf -- you'll end up paying the purchase price in bids alone. For this item, if you bid more than 305 times, you've paid the purchase price -- and only raised the cost of the item by $45.75 total.
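The arithmetic above is easy to check with a quick sketch -- working in cents to avoid floating-point drift, and using the 75-cent bid cost, 15-cent increment, $187.65 final price, and $229 retail price from above (the function name is invented for illustration):

```python
BID_COST = 75    # cents a bidder pays per bid
PRICE_STEP = 15  # cents each bid adds to the purchase price

def auction_economics(final_price_cents, retail_cents):
    """Back out the house's take from an auction's final price."""
    bids = final_price_cents // PRICE_STEP        # total bids placed
    bid_revenue = bids * BID_COST                 # what all bidders paid in bids
    house_take = final_price_cents + bid_revenue  # winner's payment + bid revenue
    # past this many bids, a single bidder has spent the retail price in bids alone
    breakeven_bids = retail_cents // BID_COST
    return bids, bid_revenue, house_take, breakeven_bids

bids, revenue, take, breakeven = auction_economics(18765, 22900)
# -> 1,251 bids, $938.25 in bid revenue, $1,125.90 total to the house,
#    and a break-even point of about 305 bids for any one bidder
```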

OK, so bidding a lot is a bad idea, so maybe we only bid one time, or a few times, and near the end of the auction? Great plan, except the auction is extended 15 seconds each and every time someone bids in those final seconds. There are absolute end dates for the auctions, but they're usually so far in the future that the auction will end through attrition long before they reach their end date. I've often wondered if eBay would implement this feature, as it would effectively end last second sniping, a huge problem for auction sites. Well, beyond the obvious problem with auctions, which is that the most optimistic person sets the price for everyone else.

There's something else at work here, though, and it's almost an exploit of human nature itself. Once you've bid on something a few times, you have a vested financial interest in that product -- a product someone else could end up winning, rendering your investment moot. This often leads to irrational decision making -- something called the endowment effect, which has even been observed in chimpanzees. So instead of doing the rational thing and walking away from a bad investment, you pour more money in, throwing good money after bad.

It's pretty clear to me that swoopo isn't an auction site. It bills itself as "entertainment shopping". I think it is in fact a lottery; the only way to win here is sheer dumb luck.

Or, of course, by not playing at all.

[Image: WarGames -- "the only winning move is not to play"]

But wait -- it gets worse! Swoopo also offers

  • Penny auctions, where each bid only increases the price of the final item by 1 cent, while still costing you 75 cents.
  • FreeBids auctions, where the item up for grabs is Swoopo bids. Near as I can tell, this is swoopo printing their own money.
  • 100% off auctions, where the "winner" (and I use this term loosely) pays nothing for the final item, regardless of what the final price is bid up to. Imagine the bidding frenzy on this one at 75 cents a pop.
  • Cash auctions, where you win actual real money at the end. It's like they're not even trying to pretend they don't run a gambling site with these.

It's not clear that Swoopo even has the items they auction; they appear to sell first, then use the money they gain from the completed auction to buy and ship the item. Furthermore, they have a clause in their Help under Delivery and Shipping that lets them ship "equivalent" items:

On rare occasions we are no longer able to source the specific item detailed in the auction. When this happens, we will contact you and offer to send you an equivalent item of at least equal value. Many of the products we sell are high-technology items that have a short life-cycle, so often this will mean an upgrade to the newer version of the item.

There are also rumblings that swoopo silently pits users from its different territory websites against each other in individual auctions, such that UK users are unwittingly bidding against US users. This ensures around-the-clock bidding that extends auction end dates as long as possible.

In short, swoopo is about as close to pure, distilled evil in a business plan as I've ever seen. They get paid for everything up front, and since they drop ship everything, there's no inventory or overhead to worry about. It is almost brilliantly evil, in a sort of evil genius way. You can't stop people from endowment-effect-fueled bidding when they have an individual chance, however small it may be, to win a $2,000 television for $80 -- while collectively sending the house $10,000 or more.

My admiration stops short of sites that prey on the weak and the uneducated -- and of business plans that are almost certainly illegal, at least here in the US.

As always, caveat emptor.

Discussion

My Scaling Hero

Inspiration for Stack Overflow occasionally comes from the unlikeliest places. Have you ever heard of the dating website Plenty of Fish?

Markus Frind built the Plenty of Fish Web site in 2003 as nothing more than an exercise to help teach himself a new programming language, ASP.NET. The site first became popular among English-speaking Canadians. Popularity among online daters in many United States cities followed more recently, and with minimal spending on advertising the site. According to data from comScore Media Metrix for November 2007, Plenty of Fish had 1.4 million unique visitors in the United States. In December, Mr. Frind said, the site served up 1.2 billion page views, and page views have soared 20 percent since Dec. 26.

The actual plentyoffish.com site design, although it has improved (believe it or not) since the last time I looked, is almost horrifyingly bad; it literally looks like a high school student's first website programming attempt. But it doesn't matter. The site is a resounding success with users, to the point that it is almost completely user-run:

No one heads to Plenty of Fish for the customer service, which is all but nonexistent. The company does not need a support structure to handle members' subscription and billing issues because the service is entirely advertising-based. Its tagline is: "100 percent free. Put away your credit card." For hand-holding, users must rely on fellow members, whose advice is found in online forums. The Dating & Love Advice category lists more than 320,000 posts, making up in sheer quantity what it lacks in a soothing live presence available by phone.

Granted, comparing a dating site to other online properties is kind of unfair. As I mentioned in an earlier post, the most sustainable and enduring business models either get you laid, or get you paid -- and the more directly, the better. Jamie Zawinski's classic Groupware Bad article covers the same ground:

So I said, narrow the focus. Your "use case" should be, there's a 22 year old college student living in the dorms. How will this software get him laid?

It's pretty clear which axis of human needs Plenty of Fish tends to. It's already working with way more cheese than most software developers will ever have.

OK, so Markus Frind singlehandedly built a massively popular free dating site that is almost entirely community run. Big deal. But what makes it especially incredible is that he does it all on a handful of servers:

  • 1.2 billion page views per month, 500,000 average unique logins per day
  • 30+ million hits per day, 500-600 per second
  • 45 million visitors per month
  • top 30 site in the US, top 10 in Canada, top 30 in the UK
  • 2 load-balanced Windows Server 2003 x64 web servers with 2 quad-core 2.66 GHz CPUs, 8 GB RAM, 2 hard drives
  • 3 database servers. No data on their configuration
  • Approaching 64,000 simultaneous connections and 2 million page views per hour
  • Internet connection is a 1 Gbps line, 200 Mbps is used
  • 1 TB per day serving 171 million images through Akamai
  • 6 TB storage array to handle millions of full sized images uploaded every month to the site

These traffic and size numbers are nothing short of astonishing. He's accomplished all this on his own, using only five servers with the same Microsoft and ASP.NET stack we use. This gives me great hope for scaling Stack Overflow without needing a lot of employees or server hardware. I'm not sure we'll ever reach those kinds of traffic levels.

That said, there are some dark clouds on the horizon; in a recent blog post, Markus noted that their free business model doesn't always scale as well as the hardware:

The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit.
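Taken literally, "cost grows 6-fold every time the database doubles" is a strongly superlinear curve: if doubling the data multiplies cost by 6, cost grows like N raised to log2(6), roughly N^2.58. A quick check of what that implies:

```python
import math

# if cost(2N) = 6 * cost(N), then cost(N) grows like N ** k where 2 ** k = 6
k = math.log2(6)        # ≈ 2.585
# so a database 10x the size costs roughly 385x as much to run
growth_at_10x = 10 ** k
```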

Of course, any resemblance between a free dating site and a question-and-answer site for programmers is purely coincidental, I'm sure.

This reminds me of a passage from Gerald Weinberg's The Psychology of Computer Programming:

In the early years of programming, a program was regarded as the private property of the programmer. One would no more think of reading a colleague's program unbidden than of picking up a love letter and reading it. This is essentially what a program was, a love letter from the programmer to the hardware, full of the intimate details known only to partners in an affair. Consequently, programs became larded with the pet names and verbal shorthand so popular with lovers who live in the blissful abstraction that assumes that theirs is the only existence in the universe. Such programs are unintelligible to those outside the partnership.

Maybe Stack Overflow is also built on love, internet style. Here's hoping that scales as well as Plenty of Fish has.

Update: Markus notes that, according to Hitwise, as of 2008 he runs the #13 website in the United States.

Discussion

Our Hacker Odyssey

Although I've never been more than a bush league hacker (at best), I was always fascinated with the tales from the infamous hacker zine 2600. I'd occasionally discover scanned issues in BBS ASCII archives, like this one, and spend hours puzzling over the techniques and information it contained.

I was excited to learn that a 2600 compilation was released earlier this year: The Best of 2600: A Hacker Odyssey. Although a lot of the information is hopelessly out of date and/or obsolete now, there's a timeless quality to the social engineering techniques, and at its core, the best articles are just plain good storytelling combined with technical writing skills.

The introduction captures, I think, the essence of 2600 -- the adventures of young adults experimenting with computers.

One of the true joys of the hacker world is the wealth of firsthand accounts that get shared throughout the community. Everyone has a story and many hackers have a whole treasure trove of them. This is what comes from being an inquisitive bunch with a tendency to probe and explore, all the while asking entirely too many questions. The rest of the world simply wasn't prepared for this sort of thing, a fact that hackers used to their advantage time and again.

[Image: cover of The Best of 2600: A Hacker Odyssey]

In the hacker world, you can have adventures and obtain information on a whole variety of levels, using such methods as social engineering, trashing, or simply communicating and meeting up with each other. All of these methods continue to work to this day. Back in the 1980s, excitement via a keyboard was a fairly new concept but it was catching on pretty fast as personal computers started to become commonplace. It seemed incredible (and still does to me) that you could simply stick your telephone into an acoustic modem, type a few letters on a keyboard, and somehow be communicating with someone in an entirely different part of the country or even another part of the globe.

Of course, hackers had already been having all sorts of adventures on telephones for years before this, whether it was through boxing, teleconferencing, or just randomly calling people. And there were the occasional "real-life" adventures, something hackers were certainly not averse to, contrary to the usual stereotypes of pasty-faced teenagers who feared going outside and interacting with the world. The point is that whenever you got a bunch of bored, curious, and daring individuals together, it didn't really matter what the setting was. On the screen, over the phone, or in real life, there was fun to be had and plenty to be learned in the process.

The mighty 2600 empire soldiers on, of course -- the latest issue is Autumn 2008. This hand-picked best-of collection works as both historical archive and introduction. It's a great starting point, and a book I continue to take with me on trips for background reading. It rarely disappoints.

If you believe, like I do, in the value of learning through cartoons, then Ed Piskor's Wizzywig graphic novels are excellent companion pieces to the 2600 compilation.

[Image: panel from Wizzywig #2: Hacker, page 10]

So far there's Wizzywig Volume 1: Phreak and Wizzywig Volume 2: Hacker and Wizzywig Volume 3: Fugitive, with a fourth and final book on the way. You can read the first two books completely free online; if you like what you see, Ed sells all the books directly on his store. It's a little eerie how accurately he captured the ambiance of that era for me, all those fumbling, exploratory sessions with nascent online community through modems, local bulletin boards, and user group meetings.

It's fun to revisit the origins of my hacker odyssey, but I feel like we're nowhere near the end of it yet.

How about you?

Discussion

Blu-Ray: Is It Time?

I've been monitoring the progress of high-definition video playback on the PC for quite a while now.

It's been almost two years since I wrote that series, and I think we're dangerously close to viable high definition video playback on typical, mainstream PCs. One metric I follow closely is the price of the hardware, and OEM Blu-Ray drives are now only $99 shipped.

[Image: LG GGC-H20LK Blu-Ray / HD-DVD combo drive]

This drive is a DVD burner, in addition to playing HD-DVDs, Blu-Ray, and obviously DVDs -- and it also has very positive customer reviews. I couldn't resist, so I bought one.

I have no need for a standalone Blu-Ray player, but a cursory look tells me those are down to around $250 for decent models. And then of course there's always the PlayStation 3 option.

It's a shame OS X and Vista don't natively support HD playback of any kind (although Vista does include some copy protection mechanisms specific to high-definition video playback, which was the source of great hue and cry). When you pair this $99 drive with some third-party playback software like PowerDVD HD or WinDVD HD, you're set.

I'm particularly interested in high definition PC playback because the home theater PC I recently built is more than capable:

Also, I finally own a true 1920 x 1080 HDTV now -- yes, you can all stop making fun of me for using a creaky old brass and steam powered 852 x 480 EDTV -- so all the pieces are now in place for me to adopt Blu-Ray. I switched my Netflix account over to Blu-Ray this morning.

I'm not quite a high definition video early adopter, but I'm still on the leading edge of the curve. Funny how technology cycles repeat themselves. I distinctly recall being an early adopter of DVDs back in 1998, almost exactly 10 years ago. The 720 x 480 resolution and Dolby Digital sound seemed so impressive back then. I remember marveling at the fancy interactive menus on the Austin Powers DVD. Of course, DVD quality is pretty pedestrian by today's standards. We've almost gotten to the point where DVD-level video quality is available worldwide in a typical web browser, not necessarily through YouTube, but through Vimeo and other alternatives.

With that in mind, I wonder how quaint Blu-Ray will seem in 2018?

Discussion

The Problem With Logging

A recent Stack Overflow post described one programmer's logging style. Here's what he logs:

INFO Level

  • The start and end of the method
  • The start and end of any major loops
  • The start of any major case/switch statements

DEBUG Level

  • Any parameters passed into the method
  • Any row counts from result sets I retrieve
  • Any datarows that may contain suspicious data when being passed down to the method
  • Any "generated" file paths, connection strings, or other values that could get mangled when being "pieced together" by the environment.

ERROR Level

  • Handled exceptions
  • Invalid login attempts (if security is an issue)
  • Bad data that I have intercepted for reporting

FATAL Level

  • Unhandled exceptions.
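For concreteness, a method instrumented in that style might look something like this sketch -- Python's stdlib logging standing in for log4net, with the method and field names invented for illustration:

```python
import logging

log = logging.getLogger(__name__)

def process_orders(customer_id, orders):  # hypothetical method; names invented
    log.info("process_orders: start")              # INFO: start of method
    log.debug("params: customer_id=%s, count=%d", customer_id, len(orders))
    processed = 0
    log.info("order loop: start")                  # INFO: start of major loop
    for order in orders:
        try:
            if order.get("path") is None:          # ERROR: intercepted bad data
                log.error("order missing path: %r", order)
                continue
            processed += 1
        except Exception:                          # ERROR: handled exception
            log.exception("failed processing %r", order)
    log.info("order loop: end")
    log.info("process_orders: end")                # INFO: end of method
    return processed
```

Note that the logging statements outnumber the lines doing actual work.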

I don't mean to single out the author here, but this strikes me as a bit .. excessive.

Although I've never been a particularly big logger, myself, one of my teammates on Stack Overflow is. So when building Stack Overflow, we included log4net, and logged a bunch of information at the various levels. I wasn't necessarily a big fan of the approach, but I figured what's the harm.

Logging does have a certain seductive charm. Why not log as much as you can whenever you can? Even if you're not planning to use it today, who knows, it might be useful for troubleshooting tomorrow. Heck, just log everything! What could it possibly hurt?

Oh, sure, logging seems harmless enough, but let me tell you, it can deal some serious hurt. We ran into a particularly nasty recursive logging bug:

  • On thread #1, our code was doing Log (lock) / DB stuff (lock)
  • On thread #2, our code was doing DB stuff (lock) / log stuff (lock)

If these things happened close enough together under heavy load, the result was -- you guessed it -- a classic out-of-order deadlock. I'm not sure you'd ever see it on a lightly loaded app, but on our website it happened about once a day on average.
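The failure mode, and the standard fix, can be sketched with two plain locks -- a toy model, not our actual code. When two threads take the same pair of locks in opposite orders, each can grab its first lock and then block forever waiting for the other's; imposing one global acquisition order makes the circular wait impossible:

```python
import threading

log_lock = threading.Lock()  # stands in for the logger's internal lock
db_lock = threading.Lock()   # stands in for the database connection's lock

# Deadlock-prone pattern: thread A acquires log_lock then db_lock, while
# thread B acquires db_lock then log_lock. Each can end up holding one
# lock while waiting forever on the other.
#
# The fix: every code path acquires the pair in one fixed order.
def work(n, results):
    with log_lock:     # always first
        with db_lock:  # always second
            results.append(n)

results = []
threads = [threading.Thread(target=work, args=(i, results)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# all eight threads finish; a circular wait cannot form under a fixed order
```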

I don't blame log4net for this; I blame our crappy code. We spent days troubleshooting these deadlocks by .. wait for it .. adding more logging! Which naturally made the problem worse and even harder to figure out. We eventually were forced to take memory dumps and use dump analysis tools. With the generous assistance of Greg Varveris, we were finally able to identify the culprit: our logging strategy. How ironic. And I mean real irony, not the fake Alanis Morissette kind.

Although I am a strong believer in logging exceptions, I've never been a particularly big fan of logging in the general "let's log everything we possibly can" sense:

  1. Logging means more code. If you're using a traditional logging framework like log4net, every logged event is at least one additional line of code. The more you log, the larger your code grows. This is a serious problem, because code is the enemy. Visible logging code is clutter -- like excessive comments, it actively obscures the code that's doing the real work in the application.

  2. Logging isn't free. Most logging frameworks are fairly efficient, but they aren't infinitely fast. Every log row you write to disk has an overall performance cost on your application. This can also be tricky if you're dissecting complex objects to place them in the log; that takes additional time.

  3. If it's worth saving to a logfile, it's worth showing in the user interface. This is the paradox: if the information you're logging is at all valuable, it deserves to be surfaced in the application itself, not buried in an anonymous logfile somewhere. Even if it's just for administrators. Logfiles are all too often where useful data goes to die, alone, unloved and ignored.

  4. The more you log, the less you can find. Log enough things and eventually your logs are so noisy nobody can find anything. It's all too easy to bury yourself in an avalanche of log data. Heck, that's the default: any given computer is perfectly capable of generating more log data than any of us could possibly deal with in our lifetime. The hidden expense here isn't the logging, it's the brainpower needed to make sense of these giant logs. I don't care how awesome your log parsing tools are, nobody looks forward to mining a gigabyte of log files for useful diagnostic information.

  5. The logfile that cried Wolf. Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO, and whatever other logging levels you have defined. If you decide to log only the most heinous serial-killer mass-murderer type problems, evil has a lot less room to lurk in your logfiles -- and it'll be a heck of a lot less boring when you do look.
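On cost #2 specifically, most frameworks offer a standard mitigation: pass raw arguments so formatting only happens when the level is enabled, and explicitly guard genuinely expensive work. A sketch with Python's stdlib logging (log4net's equivalent guard is `IsDebugEnabled`):

```python
import logging

log = logging.getLogger(__name__)
log.setLevel(logging.INFO)  # DEBUG is disabled

calls = {"expensive": 0}

def expensive_dump(obj):
    calls["expensive"] += 1  # count how often the costly work actually runs
    return repr(obj)         # imagine deep object dissection here

big_object = {"rows": list(range(1000))}

# Costly even when DEBUG is off: the argument is evaluated before the call
log.debug("state: %s", expensive_dump(big_object))

# Cheap when DEBUG is off: the guard skips the expensive work entirely
if log.isEnabledFor(logging.DEBUG):
    log.debug("state: %s", expensive_dump(big_object))

# calls["expensive"] is 1: only the unguarded line paid the price
```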

So is logging a giant waste of time? I'm sure some people will read about this far and draw that conclusion, no matter what else I write. I am not anti-logging. I am anti-abusive-logging. Like any other tool in your toolkit, when used properly and appropriately, it can help you create better programs. The problem with logging isn't the logging, per se -- it's the seductive OCD "just one more bit of data in the log" trap that programmers fall into when implementing logging. Logging gets a bad name because it's so often abused. It's a shame to end up with all this extra code generating volumes and volumes of logs that aren't helping anyone.

We've since removed all logging from Stack Overflow, relying exclusively on exception logging. Honestly, I don't miss it at all. I can't think of a single time since then that I've wished I had a giant verbose logfile to help me diagnose a problem.

When it comes to logging, the right answer is not "yes, always, and as much as possible." Resist the tendency to log everything. Start small and simple, logging only the most obvious and critical of errors. Add (or ideally, inject) more logging only as demonstrated by specific, verifiable needs.

If you aren't careful, those individual log entries, as wafer thin as they might be, have a disturbing tendency to make your logs end up like the unfortunate Mr. Creosote.

Discussion