Coding Horror

programming and human factors

Quantity Always Trumps Quality

Nathan Bowers pointed me to this five-year-old Cool Tools entry on the book Art & Fear.

Art & Fear: Observations On the Perils (and Rewards) of Artmaking

Although I am not at all ready to call software development "art" -- perhaps "craft" would be more appropriate, or "engineering" if you're feeling generous -- the parallels between some of the advice offered here and my experience writing software are profound.

The ceramics teacher announced on opening day that he was dividing the class into two groups. All those on the left side of the studio, he said, would be graded solely on the quantity of work they produced, all those on the right solely on its quality. His procedure was simple: on the final day of class he would bring in his bathroom scales and weigh the work of the "quantity" group: fifty pounds of pots rated an "A", forty pounds a "B", and so on. Those being graded on "quality", however, needed to produce only one pot - albeit a perfect one - to get an "A".

Well, came grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity. It seems that while the "quantity" group was busily churning out piles of work - and learning from their mistakes - the "quality" group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.

Where have I heard this before?

  1. Stop theorizing.
  2. Write lots of software.
  3. Learn from your mistakes.

Quantity always trumps quality. That's why the one bit of advice I always give aspiring bloggers is to pick a schedule and stick with it. It's the only advice that matters, because until you've mentally committed to doing it over and over, you will not improve. You can't.

When it comes to software, the same rule applies. If you aren't building, you aren't learning. Rather than agonizing over whether you're building the right thing, just build it. And if that one doesn't work, keep building until you get one that does.


Alpha, Beta, and Sometimes Gamma

As we prepare to begin the private beta for Stack Overflow later this week, I found myself wondering: where do the software terms alpha and beta come from? And why don't we ever use gamma?


Alpha and beta are the first two letters of the Greek alphabet; presumably they were chosen to denote the first and second rounds of software testing, respectively.

But where did these terms originate? There's an uncited Wikipedia section that claims the alpha and beta monikers came, as did so many other things, from the golden days of IBM:

The term beta test comes from an IBM hardware product test convention, dating back to punched card tabulating and sorting machines. Hardware first went through an alpha test for preliminary functionality and small scale manufacturing feasibility. Then came a beta test, by people or groups other than the developers, to verify that the hardware correctly performed the functions it was supposed to, and that it could be manufactured at scales necessary for the market. And finally, a c test to verify final safety. With the advent of programmable computers and the first shareable software programs, IBM used the same terminology for testing software. As other companies began developing software for their own use, and for distribution to others, the terminology stuck -- and is now part of our common vocabulary.

Based on the software release lifecycle page, and my personal experience, here's how I'd characterize each phase of software development (with a quick code sketch of the progression after the list):

  1. Pre-Alpha

    The software is still under active development and not feature complete or ready for consumption by anyone other than software developers. There may be milestones during the pre-alpha which deliver specific sets of functionality, and nightly builds for other developers or users who are comfortable living on the absolute bleeding edge.

  2. Alpha

    The software is complete enough for internal testing. This is typically done by people other than the software engineers who wrote it, but still within the same organization or community that developed the software.

  3. Beta

    The software is complete enough for external testing -- that is, by groups outside the organization or community that developed the software. Beta software is usually feature complete, but may have known limitations or bugs. Betas are either closed (private) and limited to a specific set of users, or they can be open to the general public.

  4. Release Candidate (aka gamma or delta)

    The software is almost ready for final release. No feature development or enhancement of the software is undertaken; tightly scoped bug fixes are the only code you're allowed to write in this phase, and even then only for the most heinous and debilitating of bugs. One of the most experienced software developers I ever worked with characterized the release candidate development phase thusly: "does this bug kill small children?"

  5. Gold

    The software is finished -- and by finished, we mean there are no show-stopping, little-children-killing bugs in it. That we know of. There are probably numerous lower-priority bugs triaged into the next point release or service pack, as well.
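
If it helps to see that progression spelled out in code, here's a minimal sketch. The enum and the example version strings are my own invention -- they loosely follow the common alpha/beta/rc pre-release tagging convention, not any official standard:

enum ReleasePhase {
    PRE_ALPHA,          // active development; nightly builds for the brave
    ALPHA,              // internal testing, within the developing organization
    BETA,               // external testing; closed (private) or open (public)
    RELEASE_CANDIDATE,  // aka gamma or delta; heinous-bug fixes only
    GOLD                // shipped; remaining bugs triaged to the next release
}

// Typical corresponding version strings:
// 1.0.0-alpha.1, then 1.0.0-beta.2, then 1.0.0-rc.1, then 1.0.0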

These phases all sound perfectly familiar to me, although there are two clear trends:

  • The definition of beta grows more all-encompassing and elastic every year.
  • We are awfully eager to throw alpha quality code over the wall to external users and testers.

In the brave new world of web 2.0, the alpha and beta designations don't mean quite what they used to. Perhaps the most troubling trend is the perpetual beta: so many websites stay in beta indefinitely that it's almost become a running joke. Gmail, for example, is still in beta more than four years after launch!

Although I've seen plenty of release candidates in my day, I've rarely seen a "gamma" or "delta". Apparently Flickr used it for a while in their logo, after heroically soldiering on from beta:

(Flickr logo variants: beta, gamma, and finally "loves you")

"loves you" is certainly more fun than "gold", but I'm not sure it's ever the same as done. Maybe that's the way it should be.


Is Money Useless to Open Source Projects?

In April I donated $5,000 of the ad revenue from this website to an open source .NET project. It was exciting to be able to inject some of the energy from this blog into the often-neglected .NET open source ecosystem.

As I mentioned at the time, I used a very hands-off approach. While I did have some up-front criteria for the award (open source license, public source control, accepts outside source contributions), it's basically a no-strings grant.

The real money is being sent via wire transfer to Dario Solera, the ScrewTurn Wiki project coordinator. What's Dario going to do with this money? You'll have to ask him. That's not for me to decide. There are no strings attached to this money of any kind. I trust the judgment of a fellow programmer to run their project as they see fit.

When I said the project could do whatever they saw fit with the money, I meant it. Buy liquor and cigarettes, throw a huge party, play it on the ponies. I'm not kidding. As long as the project team believes it's a valid way to move their project forward, whatever they say goes. It's their project, and their grant.

I hadn't heard anything from Dario, and I was curious, so I followed up with him via email. He sent back this response:

The grant money is still untouched. It's not easy to use it. Website hosting fees are fully covered by ads and donations, and there are no other direct expenses to cover. I thought it would be cool to launch a small contest with prizes for the best plugins and/or themes, but that is not easy because of some laws we have here in Italy that render the handling of a contest quite complex.

What would you suggest?

I was crushingly disappointed to find out the $5,000 in grant money has been sitting in the bank for the last four months, totally unused. That's painful to hear, possibly the most painful of all outcomes. Why did we bother doing this if nothing changes?

My friend Jon Galloway warned me this might happen. I didn't believe him. But what other conclusion can I draw at this point? He was right:

Open Source is to Traditional Software as Terror Cells are to Large Standing Armies – if you gave a terrorist group a fighter jet, they wouldn't know what to do with it. Open source teams, and culture, have been developed such that they're almost money-agnostic. Open source projects run on time, not money. So, the way to convert that currency is through bounties and funded internships. Unfortunately, setting those up takes time, and since that's the element that's in short supply, we're back to square one.

I had hoped the $5,000 in grant money would be converted into something that furthered an open source project – perhaps something involving the community and garnering more code contributions. But apparently that's more difficult than anyone realized.

Jon offered these ideas:

  • Can they turn the money over to a company or organization that's familiar with this kind of thing, like the Google Summer of Code program?
  • Oftentimes, documentation and marketing are in really short supply. Could they just hire a technical writer and/or marketing expert with the $5k?
  • SourceForge has a donations program in which people can make donations to pay developers. Maybe he can run the money through there?

I must admit I'm at a bit of a loss here. Do you have any ideas for how the ScrewTurn Wiki project can use their $5,000 open source grant effectively? If so, please share them in the comments here, or on the ScrewTurn forum – in the Suggestions and Feature Requests area.

Even I'm not naive enough to suggest that money can solve every open source software problem. But I don't have a lot of time to contribute; I only have advertising revenue. I'm absolutely dumbfounded to learn that contributing money isn't an effective way to advance an open source project. Surely money can't be totally useless to open source projects… can it?


Understanding The Hardware

I got a call from Rob Conery today asking for advice on building his own computer. Rob works for Microsoft, but lives in Hawaii. I'm not sure how he managed that, but being so far from the mothership apparently means he has the flexibility to spec his own PC. Being stuck in Hawaii is, I'm sure, a total bummer, dude.

Rob and I may disagree on pretty much everything from a coding perspective, but we can agree on one thing: we love computers. And what better way to celebrate that love than by building your own? It's not hard. This industry was built on the commodification of hardware. If you can snap together a Lego kit, you can build a computer.

Maybe this is a minority opinion, but I find understanding the hardware to be instructive for programmers. Peter Norvig -- now director of research at Google -- appears to concur.

Understand how the hardware affects what you do. Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), transfer data over ethernet (or the internet), read consecutive words from disk, and seek to a new location on disk.
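
You don't have to take those numbers on faith, either; you can measure them yourself. Here's a rough sketch -- mine, not a rigorous benchmark -- that times sequential reads against random seeks on the same file. The class name, file size, and iteration counts are arbitrary choices, and on a small file the operating system's cache will blunt the difference, so use a file larger than your RAM to see the true cost of a seek:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

public class DiskLatencySketch {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("latency", ".bin");
        f.deleteOnExit();
        byte[] block = new byte[4096];
        int blocks = 25600; // 25,600 x 4 KB = roughly 100 MB
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            for (int i = 0; i < blocks; i++) raf.write(block);

            raf.seek(0);
            long t0 = System.nanoTime();
            while (raf.read(block) != -1) { } // one sequential pass over the file
            long sequentialNs = System.nanoTime() - t0;

            Random rnd = new Random(42);
            t0 = System.nanoTime();
            for (int i = 0; i < blocks; i++) {
                raf.seek((long) rnd.nextInt(blocks) * 4096); // force a seek, then read
                raf.read(block);
            }
            long randomNs = System.nanoTime() - t0;

            System.out.printf("sequential: %d ms, random: %d ms%n",
                sequentialNs / 1000000, randomNs / 1000000);
        }
    }
}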

In my book, one of the best ways to understand the hardware is to get your hands dirty and put a machine together yourself, including installing the OS. It's a shame Apple programmers can't do this, as their hardware has to be blessed by the Cupertino DRM gods. You could build a frankenmac, but you'd be stuck running a "patched" OS X indefinitely.

As Rob and I were talking about the philosophy of building your own development PC -- something I also discussed on a Hanselminutes podcast -- he said, "You know, you should blog this." But Rob -- I already have, many times over! Let's walk down the core list of components I recommended for Rob, and I'll explain my choices with links to the relevant blog posts I've made on each topic.

ASUS P5E Intel X38 motherboard ($225)

I'm a big triple monitor guy, so I insist on motherboards that can accept two video cards -- in other words, two x8 or x16 PCI Express slots suitable for video cards. I also demand quiet from my PC, which means a motherboard with all passive cooling. Beyond that, I don't like to pay a lot for a fancy motherboard. After spending the last five years with motherboards packing scads of features I never end up using (two ethernet ports, anyone?), I've realized there are better ways to invest your money. People tend to respect ASUS as one of the largest and most established Taiwanese OEMs, so it's usually a safe choice. Go as far down in price on the motherboard as you can without losing whatever essential features you truly need, and save that money for the other parts.

Intel Core 2 Duo E8500 3.16 GHz CPU ($190)
Intel Core 2 Quad Q9300 2.5 GHz CPU ($270)

Ah, the eternal debate: dual versus quad. Despite what Intel's marketing weasels might want you to believe, clock speed still matters very much. Here's an example: SQL Server 2005 queries on my local box, a 3.5 GHz dual core, execute more than twice as fast as on our server, a 1.8 GHz eight core machine. Sadly, very few development environments parallelize well, with the notable exception of C++ compilers. Outside of a few niche activities, such as video encoding and professional 3D rendering, most computing tasks don't scale worth a damn beyond two cores. Yes, it's exciting to see those four graphs in Task Manager (and even I get a little giddy when I see sixty-four of 'em), but take a look at the cold, hard benchmark data and the contents of your wallet before letting that seductive 4 > 2 math hijack the rational parts of your brain.
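
Amdahl's law puts hard numbers on that intuition: if 60 percent of a task can be parallelized, a quad core tops out at a speedup of 1 / (0.4 + 0.6/4), or roughly 1.8x -- and no number of cores, however large, can ever push it past 1 / 0.4 = 2.5x. The serial portion always wins in the end, which is exactly why clock speed still matters.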

It's also smart to buy a little below the maximum, with the ultimate goal of upgrading to a whizzy-bang 4 GHz quad core CPU sometime in the future. One of the hidden value propositions in building your own PC is the ability to easily upgrade it later. CPU is one of the most obvious upgrade points where you want to intentionally underbuy a little. Give yourself some room for future upgrades. Until a quad costs the same as a dual at the same clock speed, my vote still goes to the fastest dual core you can afford.

Kingston 4GB (2 x 2GB) DDR2 800 x 2 ($156)

Memory is awesomely cheap. When it comes to memory, I like to buy a few notches above the cheapest stuff, and Kingston has been a consistently reliable brand for me at that price level. There's no reason to bother with anything under 8 GB these days. Don't get hung up on memory speed, though. Quantity is more important than a few extra ticks of speed. But don't take my word for it. As an experiment, Digit-Life cut the speed of memory in half, with a resulting overall average performance loss of merely three percent. By the time your system has to reach outside the L1, L2, and possibly even L3 caches, main memory is already so slow, relatively speaking, that a few extra nanoseconds of memory speed won't make any measurable difference. This is also why I specified the latest and greatest Intel CPUs with larger 6 MB L2 caches. Remember, kids, Caching Is Fundamental!
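
If you want to see the cache hierarchy's dominance for yourself, here's a toy sketch -- my own, not a rigorous benchmark -- that walks the same array once in order and once in a shuffled order. On typical hardware the shuffled walk, which incurs a cache miss on nearly every element, is many times slower, even though both loops touch exactly the same memory:

import java.util.Random;

public class CacheLocalitySketch {
    public static void main(String[] args) {
        int n = 1 << 23; // 8M ints = 32 MB, well beyond any L2/L3 cache
        int[] data = new int[n];
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;

        // Fisher-Yates shuffle to create a cache-hostile access pattern
        Random rnd = new Random(42);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = order[i]; order[i] = order[j]; order[j] = tmp;
        }

        long sum = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sum += data[i]; // sequential: prefetch-friendly
        long seqNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sum += data[order[i]]; // shuffled: miss after miss
        long randNs = System.nanoTime() - t0;

        System.out.printf("sequential: %d ms, shuffled: %d ms (sum=%d)%n",
            seqNs / 1000000, randNs / 1000000, sum);
    }
}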

Western Digital VelociRaptor 300 GB 10,000 RPM Hard Drive ($290)

This is arguably the only indulgence on the list. The Velociraptor is an incredibly expensive drive, but it's also a rocket of a hard drive. I'm a big believer in the importance of disk speed to overall system performance, particularly for software developers. At least Scott Guthrie backs me up on this one. Trust me, you want a 10,000 RPM boot drive. Buy a slower large drive for your archiving needs. You want two drives, anyway; having two spindles will give you a lot of flexibility and also help your virtual machine performance immensely.

This new Raptor model is the best of the series. It's much quieter, uses less power, generates less heat, and is by far the fastest -- embarrassingly fast. It's expensive, yes. I won't hold it against you if you decide to disregard this advice and go with a respectably fast, less expensive hard drive. But to me, it's all about putting the money where the most significant bottlenecks are, and considered in that light -- man, this thing is so worth it. As Storage Review said, "[its] single-user scores ... blow away those of every other [hdd]".

Radeon HD 4850 512MB video card ($155 after rebate)

Even if you're not a gamer, it's hard to ignore the charms of this amazing powerhouse of a video card. The brand new ATI 4850 delivers performance on par with the very fastest $500+ video card you can buy for a measly hundred and fifty bucks! Modern operating systems require video grunt, either for windowing effects or high-definition video playback. Beyond that, it's looking more and more like some highly parallelizable tasks may move to the GPU. Have you ever read stuff like "even the slowest GPU implementation was nearly 6 times faster than the best-performing CPU version"? Get used to reading statements like that; I expect you'll be reading a lot more of them in the future as general purpose APIs for GPU programmability become mainstream. That's another reason, as a programmer and not necessarily a gamer, you still want a modern video card. For all this talk of coming 8 and 16 core CPUs, eventually the GPU could be the death of the general purpose CPU.

We also want our video card to be efficient. Many don't realize this, but your video card can consume as much power as your CPU. Sometimes even more! The 4850, for all its muscle, is remarkably efficient as well. According to a recent AnandTech roundup, it's on par with the most efficient cards of this generation. Pay attention to your idle power consumption, because power consumed means heat produced, which in turn means additional noise and possible instability.

Corsair 520HX 520W Power Supply ($100 after rebate)

The power supply is probably one of the most underrated and misunderstood components of a modern PC. People tend to focus on the "watts" number, when the really important number is efficiency: a certain percentage of the energy that goes into every power supply is turned into waste heat. An efficient power supply will run cooler and more reliably because it uses higher quality parts. People also think you need 1.21 Jigawatts to run a powerful desktop system, but that's just not true. Unless you have a bleeding-edge CPU paired with two top-of-the-line gaming class video cards, trust me -- even 500 watts is overkill.
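
To make the efficiency number concrete: a supply running at 80 percent efficiency while delivering 300 watts to your components actually pulls 300 / 0.8 = 375 watts from the wall, and the missing 75 watts is shed as heat inside the power supply itself. The higher the efficiency, the less heat you have to deal with.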

The Corsair model I recommend gets stellar reviews. It has modular cables and carries the 80 Plus designation, meaning it's at least 80% efficient across its rated load range. Note that a quality power supply is not a substitute for a quality UPS or surge protector, but it helps.

Scythe "Ninja" SCNJ-2000 cooler ($50)
Scythe "Ninja Mini" SCMNJ-1000 cooler ($35)

I'll be honest with you. I have a giant heatsink fetish. These giant hunks of aluminum and copper, and the liquid-filled heatpipes that drive them, fascinate me. But there's a more practical reason, as well: if you want a quiet computer, you don't even bother with the stock coolers that are bundled with the CPU. Over the last few years, I keep coming back to Scythe's classic "Ninja" tower cooler, which is available in tall and short varieties. They're so astoundingly efficient that, with adequate case ventilation, they can be run fanless. I even (barely) managed to squeeze the Ninja Mini into my home theater PC build, and it's now mercifully fanless as well. There are plenty of other great tower/heatpipe coolers on the market, but the Ninja is still one of the best, a testament to its pioneering design. The CPU is (usually) the biggest consumer of power in your PC, so it's sensible to invest in a highly efficient aftermarket cooler to keep noise and heat at bay under load.

There you have it. More than you ever possibly wanted to know about how an obsessive geek builds a PC -- painstakingly analyzing every single part that goes into it. Now, like Rob, you're probably sorry you asked; who needs all the philosophical digressions, just give us the damn parts list! OK, here it is:

The best bang for the buck developer x86 box I can come up with, all for around $1100.

I try to avoid posting about hardware too much, but sometimes I can't help myself. I blame Rob. Enjoy your new system, Mr. Conery.


Coding Without Comments

If peppering your code with lots of comments is good, then having zillions of comments in your code must be great, right? Not quite. Excess is one way good comments go bad:

'*************************************************
' Name: CopyString
'
' Purpose: This routine copies a string from the source
' string (source) to the target string (target).
'
' Algorithm: It gets the length of "source" and then copies each
' character, one at a time, into "target". It uses
' the loop index as an array index into both "source"
' and "target" and increments the loop/array index
' after each character is copied.
'
' Inputs: input The string to be copied
'
' Outputs: output The string to receive the copy of "input"
'
' Interface Assumptions: None
'
' Modification History: None
'
' Author: Dwight K. Coder
' Date Created: 10/1/04
' Phone: (555) 222-2255
' SSN: 111-22-3333
' Eye Color: Green
' Maiden Name: None
' Blood Type: AB-
' Mother's Maiden Name: None
' Favorite Car: Pontiac Aztek
' Personalized License Plate: "Tek-ie"
'*************************************************

I'm constantly running across comments from developers who don't seem to understand that the code already tells us how it works; we need the comments to tell us why it works. Code comments are so widely misunderstood and abused that you might find yourself wondering if they're worth using at all. Be careful what you wish for. Here's some code with no comments whatsoever:

r = n / 2;
while ( abs( r - (n/r) ) > t ) {
  r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

Any idea what that bit of code does? It's perfectly readable, but what the heck does it do?

Let's add a comment.

// square root of n with Newton-Raphson approximation
r = n / 2;
while ( abs( r - (n/r) ) > t ) {
  r = 0.5 * ( r + (n/r) );
}
System.out.println( "r = " + r );

That must be what I was getting at, right? Some sort of pleasant, middle-of-the-road compromise between the two polar extremes of no comments whatsoever and carefully formatted epic poems every second line of code?

Not exactly. Rather than add a comment, I'd refactor to this:

private double SquareRootApproximation(double n) {
  double r = n / 2;
  while ( abs( r - (n/r) ) > t ) {
    r = 0.5 * ( r + (n/r) );
  }
  return r;
}
System.out.println( "r = " + SquareRootApproximation(n) );

I haven't added a single comment, and yet this mysterious bit of code is now perfectly understandable.

While comments are neither inherently good nor bad, they are frequently used as a crutch. You should always write your code as if comments didn't exist. This forces you to write your code in the simplest, plainest, most self-documenting way you can humanly come up with.

When you've rewritten, refactored, and rearchitected your code a dozen times to make it easy for your fellow developers to read and understand – when you can't possibly imagine any conceivable way your code could be changed to become more straightforward and obvious – then, and only then, should you feel compelled to add a comment explaining what your code does.

As Steve points out, this is one key difference between junior and senior developers:

In the old days, seeing too much code at once quite frankly exceeded my complexity threshold, and when I had to work with it I'd typically try to rewrite it or at least comment it heavily. Today, however, I just slog through it without complaining (much). When I have a specific goal in mind and a complicated piece of code to write, I spend my time making it happen rather than telling myself stories about it [in comments].

Junior developers rely on comments to tell the story when they should be relying on the code to tell the story. Comments are narrative asides; important in their own way, but in no way meant to replace plot, characterization, and setting.

Perhaps that's the dirty little secret of code comments: to write good comments you have to be a good writer. Comments aren't code meant for the compiler, they're words meant to communicate ideas to other human beings. While I do (mostly) love my fellow programmers, I can't say that effective communication with other human beings is exactly our strong suit. I've seen three-paragraph emails from developers on my teams that practically melted my brain. These are the people we're trusting to write clear, understandable comments in our code? I think maybe some of us might be better off sticking to our strengths – that is, writing for the compiler, in as clear a way as we possibly can, and reaching for the comments only as a method of last resort.

Writing good, meaningful comments is hard. It's as much an art as writing the code itself; maybe even more so. As Sammy Larbi said in Common Excuses Used To Comment Code, if you feel your code is too complex to understand without comments, your code is probably just bad. Rewrite it until it doesn't need comments any more. If, at the end of that effort, you still feel comments are necessary, then by all means, add comments … carefully.
