Coding Horror

programming and human factors

The Girl Who Proved P = NP

One of my all-time favorite blog entries is a truly epic tale of dating gone wrong, one that culminates in what is probably the strangest reference to P=NP you'll ever encounter.

Joey: So you really did graduate from computer engineering?

New Girl: Yes I did, from UBC!

Joey: And you took the "Algorithms" course?

New Girl: Of course!

Joey: And you have all the papers you wrote?

New Girl: Yes! I kept them all, and I'll show them to you tomorrow!

Joey: I want to see the one we always called the "Hell Paper" at Queen's -- the mandatory fourth-year paper. You know the one, where we prove P = NP?

New Girl: I did that! I proved P = NP! I placed near the top of the class, and the professor used my paper as an example!

Joey: You proved P = NP?

New Girl: Yes!

Joey: Gotcha.

Poor Joey. Dating crazy people is one thing, but dating crazy people who claim to have proved P=NP is another matter entirely. I know, I know, my track record with P=NP is hardly any better. But at least you're not dating me, right?

NP-completeness sits at the heart of one of the great unsolved mysteries in computer science; perhaps the best way to illustrate it is with this xkcd cartoon:

np_complete.png

The defining characteristic of an NP-complete problem is that exact, optimal solutions are, using math and logic as we currently understand them, effectively out of reach. Sure, you can approximate a solution, but computing an optimal one requires so many calculations as to be infeasible, even on computers operating at, say, the speed of light.

In fact, one of the outstanding problems in computer science is determining whether questions exist whose answer can be quickly checked, but which require an impossibly long time to solve by any direct procedure. Problems like the one listed above certainly seem to be of this kind, but so far no one has managed to prove that any of them really are so hard as they appear, i.e., that there really is no feasible way to generate an answer with the help of a computer.
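
To make that asymmetry concrete, take subset-sum, a classic NP-complete problem: given a set of numbers, is there a subset that adds up to a target? Here's a minimal Python sketch (the function names are mine, purely for illustration) of the gap between solving and checking:

    from itertools import combinations

    def solve_subset_sum(numbers, target):
        """Brute force: try every subset -- up to 2^n candidates.
        This exponential blowup is what makes large instances infeasible."""
        for r in range(len(numbers) + 1):
            for subset in combinations(numbers, r):
                if sum(subset) == target:
                    return subset  # a certificate of the answer
        return None

    def verify_subset_sum(numbers, target, certificate):
        """Checking a proposed answer takes one pass: O(n)."""
        return (set(certificate) <= set(numbers)   # assumes distinct numbers
                and sum(certificate) == target)

    nums = [3, 34, 4, 12, 5, 2]
    answer = solve_subset_sum(nums, 9)          # slow: explores up to 2^6 subsets
    print(answer)                               # (4, 5)
    print(verify_subset_sum(nums, 9, answer))   # fast: True

With six numbers the brute force is instant; with sixty, its 2^60 subsets would outlast you, while verifying a proposed answer would still take a single pass.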

It's doubtful anyone will settle the P=NP question any time soon (pdf), but in the meantime it's useful to be able to recognize problems that are NP-complete:

Unfortunately, proving inherent intractability can be just as hard as finding efficient algorithms.

The theory of NP-completeness provides many straightforward techniques for proving that a given problem is "just as hard" as a large number of other problems that are widely recognized as being difficult and that have been confounding the experts for years. Armed with these techniques, you might be able to prove that the bandersnatch problem is NP-complete and march into your boss's office and announce:

np-complete-cartoon.png

I can't find an efficient algorithm, but neither can all these famous people.

At the very least, this would inform your boss that it would do no good to fire you and hire another expert on algorithms.

Now you can spend your time looking for efficient algorithms that solve various special cases of the general problem. You might look for algorithms that, though not guaranteed to run quickly, seem likely to do so most of the time. Or you might even relax the problem somewhat, looking for a fast algorithm that merely finds designs that meet most of the component specifications. Thus, the primary application of the theory of NP-completeness is to assist algorithm designers in directing their problem-solving efforts toward those approaches that have the greatest likelihood of leading to useful algorithms.

As with so many things in programming, the first step is learning enough to know when you're really screwed.

Unfortunately for poor Joey, this sad corollary to NP-completeness apparently applies to dating, too.

Server Fault: Calling All Lusers

It's pop quiz time! Put away your notes, and let's begin.

a) Do you own this book?*

unix-system-administration-handbook.png

b) Do you know who this man is?

mark-russinovich-sysinternals.jpg

c) Does this FAQ look familiar to you?

3) OUR LITTLE FRIEND, THE COMPUTER
3.1) Are there any OSes that don't suck?
3.2) Are there any vendors that don't suck?
3.3) How about any hardware?
3.4) Just HOW MUCH does this system suck?
3.5) Where can I find clueful tech support?
3.6) What can I do to help my computers behave?

d) Does the acronym BOFH mean anything to you?

e) Do you think this is funny?

april-fools-day-rfcs.png

If you answered "yes" to any of the above, I am sorry to inform you that you may be a system administrator or IT professional. But I do have one bit of potentially good news for you, at least in theory:

Server Fault is now in public beta!

serverfault-logo.png

Server Fault is a sister site to Stack Overflow, which we launched back in September 2008. It uses the same engine, but it's not just for programmers any more:

Server Fault is for system administrators and IT professionals, people who manage or maintain computers in a professional capacity. If you are in charge of ...
  • servers
  • networks
  • many desktop PCs (other than your own)
... then you're in the right place to ask your question! Well, as long as the question is about your servers, your networks, or desktops you support, anyway.

Please note that Server Fault is not for general computer troubleshooting questions; if you paid for that desktop hardware, and it's your personal workstation, it is unlikely that your question is appropriate for Server Fault.

I occasionally dabble in system administration and IT professional stuff; my last blog entry was about RAID, for example. As a programmer who loves hardware as much as software, I've wanted this site for months, and I'm thrilled to see it go live, as I explained on a recent RunAs radio podcast.

Although there is certainly some crossover, we believe that the programming community and the IT/sysadmin community are different beasts. Just because you're a hotshot programmer doesn't mean you have mastered networking and server configuration. And I've met a few sysadmins who could script circles around my code. That's why Server Fault gets its own domain, user profiles, and reputation system.

userfriendly-evolution-of-language.png

So if you're a bona fide BOFH, or just a wanna-be BOFH luser like me, join us on Server Fault. Who knows, maybe we lusers can learn something from each other.

* (For the record, yes, I do own that book -- although I am easily the world's worst UNIX system administrator.)

Beyond RAID

I've always been leery of RAID on the desktop. But on the server, RAID is a definite must:

"RAID" is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. The different schemes/architectures are named by the word RAID followed by a number, as in RAID 0, RAID 1, etc. RAID's various designs all involve two key design goals: increased data reliability or increased input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across multiple disks, but the array is seen by the computer user and operating system as one single disk.

I hadn't worked much at all with RAID, as I felt the benefits did not outweigh the risks on the desktop machines I usually build. But the rules are different in the datacenter; the servers I built for Stack Overflow all use various forms of RAID, from RAID 1 to RAID 6 to RAID 10. While working with these servers, I was surprised to discover there are now umpteen zillion numbered variants of RAID -- but they all appear to be based on a few basic, standard forms:

RAID 0: Striping

Data is striped across (n) drives, which improves performance almost linearly with the number of drives, but at a steep cost in fault tolerance; a failure of any single striped drive renders the entire array unreadable.

raid-0-diagram.png
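
The round-robin layout is easy to picture in code. Here's a rough sketch (the function is hypothetical, just to show the arithmetic) of how a logical block address maps onto physical drives:

    def raid0_locate(logical_block, n_drives):
        """RAID 0: logical blocks are dealt out round-robin across drives."""
        drive = logical_block % n_drives    # which physical disk
        offset = logical_block // n_drives  # which block on that disk
        return drive, offset

    # With 4 drives, blocks 0..7 land on drives 0,1,2,3,0,1,2,3 -- a big
    # sequential read keeps all four spindles busy at once. It also means
    # every drive holds every 4th block, so losing any one drive shreds
    # the whole array.
    for block in range(8):
        print(block, raid0_locate(block, 4))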

RAID 1: Mirroring

Data is duplicated across (n) drives, which offers near-perfect redundancy at a slight performance decrease when writing -- and at the cost of half your overall storage. As long as one drive in the mirror array survives, no data is lost.

raid-1-diagram.png

RAID 5: Parity

Data is written across (n) drives with a parity block. The array can tolerate one drive failure, at the cost of one drive's worth of storage. There may be a serious performance penalty when writing (as parity must be calculated on every write) and when the array is rebuilding.

raid-5-diagram.png
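
The parity block is essentially a byte-wise XOR across the stripe, which also hints at why rebuilds are so punishing: recovering one lost block means reading every surviving drive. A toy sketch:

    from functools import reduce

    def xor_blocks(blocks):
        """Byte-wise XOR across equal-sized blocks."""
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    # One RAID 5 stripe: three data blocks plus one parity block.
    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(data)

    # Simulate losing drive 1: XOR the survivors with parity to rebuild it.
    rebuilt = xor_blocks([data[0], data[2], parity])
    assert rebuilt == data[1]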

RAID 6: Dual Parity

Data is written across (n) drives with two parity blocks. The array can tolerate two drive failures, at the cost of two drives' worth of storage. There may be a serious performance penalty when writing (as both parity blocks must be calculated on every write) and when the array is rebuilding.

raid-6-diagram.png

(Yes, there are other forms of RAID, but as far as I can tell they are rarely implemented or used.)

It's also possible to create so-called RAID 10 or RAID 50 arrays by nesting these RAID levels together. If you take four hard drives, mirror the two pairs, then stripe across the two mirrored pairs -- why, you just created yourself a magical RAID 10 concoction! (Stripe the pairs first and mirror the result, and you get the closely related RAID 0+1 instead.) What's particularly magical about RAID 10 is that it inherits the strengths of both of its parents: mirroring provides excellent redundancy, and striping provides excellent speed. Some would say that RAID 10 is so good it completely obviates any need for RAID 5, and I for one agree with them.
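
To see the capacity trade-offs side by side, here's a quick back-of-the-envelope helper (the figures follow directly from the descriptions above; the function itself is mine):

    def usable_capacity(n_drives, drive_size_tb, level):
        """Rough usable capacity for the common RAID levels."""
        if level == "RAID 0":  return n_drives * drive_size_tb        # no redundancy
        if level == "RAID 1":  return drive_size_tb                   # n-way mirror
        if level == "RAID 5":  return (n_drives - 1) * drive_size_tb  # one drive of parity
        if level == "RAID 6":  return (n_drives - 2) * drive_size_tb  # two drives of parity
        if level == "RAID 10": return n_drives // 2 * drive_size_tb   # mirrored pairs
        raise ValueError(level)

    for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6", "RAID 10"):
        print(f"{level}: {usable_capacity(4, 1, level)} TB usable from 4 x 1 TB drives")

RAID 10's price is the same as RAID 1's -- half your raw storage -- but you get striped speed and can survive one failure per mirrored pair.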

This was all fascinating new territory to me; I knew about RAID in theory but had never spent hands-on time with it. The above is sufficient as a primer, but I recommend reading through the Wikipedia entry on RAID for more depth.

It's worth mentioning here that RAID is in no way a substitute for a sane backup regimen, but rather a way to offer improved uptime and survivability for your existing systems. Hard drives are cheap and getting cheaper every day -- why not use a whole slew of the things to get better performance and better reliability for your servers? That's always been the point of Redundant Array of Inexpensive Disks, as far as I'm concerned. I guess Sun agrees; check out this monster:

sun-x4500-top.jpg

That's right, 48 commodity SATA drives in a massive array, courtesy of the Sun Fire X4500. It also uses a new RAID system dubbed RAID-Z:

RAID-Z is a data/parity scheme like RAID-5, but it uses dynamic stripe width. Every block is its own RAID-Z stripe, regardless of blocksize. This means that every RAID-Z write is a full-stripe write. This, when combined with the copy-on-write transactional semantics of ZFS, completely eliminates the RAID write hole. RAID-Z is also faster than traditional RAID because it never has to do read-modify-write.

But far more important, going through the metadata means that ZFS can validate every block against its 256-bit checksum as it goes. Traditional RAID products can't do this; they simply XOR the data together blindly.

Which brings us to the coolest thing about RAID-Z: self-healing data. In addition to handling whole-disk failure, RAID-Z can also detect and correct silent data corruption. Whenever you read a RAID-Z block, ZFS compares it against its checksum. If the data disks didn't return the right answer, ZFS reads the parity and then does combinatorial reconstruction to figure out which disk returned bad data. It then repairs the damaged disk and returns good data to the application. ZFS also reports the incident through Solaris FMA so that the system administrator knows that one of the disks is silently failing.

Finally, note that RAID-Z doesn't require any special hardware. It doesn't need NVRAM for correctness, and it doesn't need write buffering for good performance. With RAID-Z, ZFS makes good on the original RAID promise: it provides fast, reliable storage using cheap, commodity disks.
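
That read-verify-repair loop is simple to sketch, even if the real thing is far more sophisticated. Here's a toy version (the names and the SHA-256 stand-in checksum are my assumptions; real ZFS stores its 256-bit checksums in parent metadata blocks):

    import hashlib
    from functools import reduce

    def xor(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    def self_healing_read(data_disks, parity, checksums):
        """Toy RAID-Z read path: verify every block against its checksum;
        on a mismatch, rebuild the bad block from the survivors plus
        parity, then repair the damaged disk in place."""
        for i, block in enumerate(data_disks):
            if hashlib.sha256(block).digest() != checksums[i]:
                survivors = [b for j, b in enumerate(data_disks) if j != i]
                data_disks[i] = xor(survivors + [parity])  # heal the bad disk
        return b"".join(data_disks)

    # One stripe: two data blocks plus parity, checksummed at write time.
    disks = [b"good", b"data"]
    checks = [hashlib.sha256(b).digest() for b in disks]
    parity = xor(disks)
    disks[1] = b"BAD!"                               # silent corruption
    print(self_healing_read(disks, parity, checks))  # b'gooddata'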

Pardon the pun, but I'm not sure if it makes traditional hardware RAID redundant, necessarily. Even so, there are certainly fantastic, truly next-generation ideas in ZFS. There's a great ACM interview with the creators of ZFS that drills down into much more detail. Hard drives may be (mostly) dumb hunks of spinning rust, but it's downright amazing what you can do when you get a whole bunch of them working together.

Penny Auctions: They're Gambling

Late last year, I encountered what may be nearly perfect evil in business plan form: Swoopo. What is Swoopo? It's a penny auction site -- a class of auction where bidders pay for the privilege of bidding:

[Penny auctions] offer new televisions, computers, game consoles, appliances, handbags, gold bars and more for starting prices of a penny to 15 cents, depending on the site.

To "win" a product, shoppers must first buy a bundle of 10 to 700 bids for 60 cents to $1 each. Shoppers use one each time they place a virtual bid on a product. Each bid raises the price of the item by a penny to 15 cents, depending on the site. Some have automatic bidding functions similar to eBay.

Doing the math and not getting carried away is important: The final price of a product that retails for $100 might be $29, but the total price paid could be much more, depending upon the number of bids used. If a shopper bids 10 times at $1 a bid, for instance, the total price paid would jump to $39. And, there is the real possibility of using all your bids without getting the product.

Auction winners generally get their item for about 65 percent off retail but could save as much as 98 percent if there are few bidders.

Since the sites make the bulk of their revenue from the purchase of bids, they profit most when they feature a product that elicits a bidding war.
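
It's worth actually running the numbers in that quote, because they reveal where the house's profit comes from. A quick back-of-the-envelope calculation (penny increments and $1 bids assumed, per the ranges quoted above):

    # Work in integer cents to avoid floating-point dust.
    retail_cents      = 100_00  # item retails for $100
    final_price_cents = 29_00   # the "bargain" $29 closing price
    increment_cents   = 1       # each bid raises the price one cent
    bid_cost_cents    = 100     # each bid costs its bidder $1

    total_bids = final_price_cents // increment_cents  # 2,900 bids placed
    house_take = total_bids * bid_cost_cents + final_price_cents
    print(f"{total_bids} bids -> house collects ${house_take / 100:,.2f} "
          f"on a ${retail_cents / 100:,.2f} item")
    # 2900 bids -> house collects $2,929.00 on a $100.00 item

The winner's $39 is a rounding error; the other 2,890 bids were paid for by people who walked away with nothing.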

One of Swoopo's investors recently contacted me via email, and I had to marvel at the size of the cojones you'd need to associate yourself with this kind of nastiness. Swoopo is evil beyond the likes of Saddam Hussein, The Balrog, OSB, Darth Vader, and Barbra Streisand -- combined.

He wanted to talk to me on the phone about positioning, and staunchly maintained that there was no element of chance in a Swoopo "auction". Once I stopped laughing, I told him these were my terms:

If you believe in Swoopo, then data speaks much louder than words.

Let's conduct an experiment.

Doesn't have to be you, personally. Take n dollars, and use those n dollars in whatever strategy it takes to win items (of MSRP $399 or higher) on Swoopo.

If Swoopo isn't a game of chance or lottery, a skilled player should be able to win at least one item in this experiment, yes?

I'd be happy to run this experiment and write about it on my blog. Just let me know what terms you think make sense.

I haven't heard from him since. (Now I'm curious if anyone is willing to take on this experiment, under the same terms.)

That's because Swoopo is, at its heart, thinly veiled gambling. The companies backing Swoopo and other penny auction sites are hoping unsophisticated regulatory agencies will buy the "it's not a game of chance" argument if it's wrapped in enough technical intarweb mumbo-jumbo that they can't fully comprehend it.

But we're no government flacks. We're programmers, and many of us develop websites for a living. It's a bit tougher to pull the wool over our eyes. In Trying to Game Swoopo, Joshua Stein pulled out everything in his programmer's bag of tricks to win a Swoopo auction -- and, predictably, failed.

With all of this data available, I concluded that there is no way to reliably win an auction on swoopo.com without using their bidbutler service. There are delays on their network/servers in processing manual bids, whether intentional or just due to bad design, that cause manual bids placed with 1 or 2 seconds remaining not to be cast. users of their bidbutler service have an unfair advantage in that their bids are placed on the server side and are not subject to these delays.

Since it is not possible to reliably place manual bids, the only way to guarantee that an auction can be won (while still coming out ahead) is to use the site's bidbutler service with high ceilings on the number of bids and amount that one will let it bid up to. Those ceilings have to take into account the item's current price, and will be lower the longer an item is being bid on.

As Joshua's data shows, there is no way to win a Swoopo auction other than through sheer random chance -- that is, your client-side bid happens to wind its way through the umpteen internet routers between the server and your computer in time, ending up at the top of a queue with dozens or hundreds of other bids placed within a fraction of a second of each other. And what's worse, you may not have any chance at all, unless you place a server-side bet through their exploitatively expensive "bidbutler" service.
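
Joshua's latency finding is easy to model. Here's a toy simulation of a manual bidder racing the clock (the delay range is a made-up assumption, not a measurement; it's the randomness that matters, not the exact numbers):

    import random

    def manual_bid_lands(time_left_s, rng):
        """A manual bid must cross the network before the clock hits zero."""
        network_delay = rng.uniform(0.2, 3.0)  # routers, queues, server lag
        return network_delay < time_left_s

    rng = random.Random(42)
    trials = 100_000
    wins = sum(manual_bid_lands(1.5, rng) for _ in range(trials))
    print(f"Manual bid with 1.5 s left lands {wins / trials:.0%} of the time")
    # ~46% -- a coin flip. A bidbutler bid is placed server-side and
    # effectively always lands, which is exactly the house's edge.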

As I said in my original post, the only winning strategy at Swoopo, or any other penny auction site, is not to play. Swoopo produces nothing but losers, millions of them -- and by that I mean people gambling away millions of real dollars to the house. Which would be OK, I guess, if it were properly regulated as gambling. Swoopo and all these other penny auction sites should be regulated and classified as what they really are: online gambling sites in sheep's clothing.

Let's see what we can do to hasten this process along. Warn your friends and family. Complain to the Better Business Bureau and other regulatory agencies. And if you feel as strongly as I do about this, please write your congressmen/women and urge them to regulate these exploitative penny auctions.

How to Motivate Programmers

There's an inherent paradox in motivating programmers. I think this Geek Hero Comic illustrates it perfectly:

geek-hero-panel-1.png

geek-hero-panel-2.png

It's a phenomenon I've noticed even in myself. Nothing motivates like having another programmer tell you they're rewriting your code because it sucks. Dave Thomas has talked about this for years in his classic Developing Expertise presentation, supported by the following quote:

Interestingly enough, a friend of mine (who is a quality control manager in a hospital) often makes identical statements in reference to doctors: Polite requests, coercion, etc. are useless at best and often detrimental. Peer pressure and competition are the key.

Don't try to race sheep,
Don't try to herd race horses

Yes, the use of the term sheep is mildly derogatory, but the general principle is sound: use motivational techniques that are appropriate to the level of developers you're working with. If you have neophyte developers, herd them with maxims, guidelines and static rules. If you have experienced developers, rules are less useful. Instead, encourage them to race: engage in a little friendly competition and show off how good they are to their peers.
