Coding Horror

programming and human factors

TryParse and the Exception Tax

In .NET 1.1, TryParse is only available for the Double datatype. Version 2.0 of the framework extends TryParse to all the basic datatypes. Why do we care? Performance. Parse throws an exception if the conversion from a string to the specified datatype fails, whereas TryParse explicitly avoids throwing an exception.
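In code, the difference looks something like this (a minimal VB.NET sketch of my own, not the actual demo code):

Dim input As String = "not a number"
Dim value As Integer

' Parse signals failure by throwing an exception...
Try
    value = Integer.Parse(input)
Catch ex As FormatException
    value = 0
End Try

' ...while TryParse signals failure through its return value.
' No exception is thrown, so there's no exception tax to pay.
If Not Integer.TryParse(input, value) Then
    value = 0
End If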

In a presentation of hers (ppt) that I stumbled across a few months ago, Julia Lerman showed a screenshot of a cool little demo app demonstrating the performance difference between Parse and TryParse. I was shocked at how much faster TryParse was! I knew exceptions were slow, but… wow.

The original source for this sample app was a BCL team blog entry from way back in October 2003. That code sample is pretty ancient by now, so I thought I'd pick it up and update it so that it at least loads in VS.NET 2005 beta 2. In the process of doing this, I found out that this little sample app has.. er.. some bugs. A lot of bugs, actually. Bugs that made this dramatic performance difference not so dramatic any more:

So, yeah, parsing with exceptions is quite a bit slower, but not "did someone just downgrade my computer to a 486?" slower. The general rule of avoiding exceptions in your primary code paths still applies. That said, it's not unreasonable to use exceptions for program flow when the situation warrants it, as Alex Papadimoulis points out:

I think that there's a general consensus out there that Exceptions should be limited to exceptional circumstances. But "exceptional" is a rather subjective adjective, so there's a bit of a gray area as to what is and isn't an appropriate use of Exceptions.

Let's start with an inappropriate use that we can all agree on. I can think of no better place to find such an example than TheDailyWTF.com. Although that particular block of code doesn't exactly deal with throwing exceptions, it is a very bad way of handling exceptions. At the other extreme, exceptions are appropriate for handling environment failure. For example, if your database throws "TABLE NOT FOUND," that would be the time to catch, repackage, and throw an exception.

But it's in the middle where there's a bit of disagreement. One area in particular I'd like to address in this post is exceptions to business rules. I mentioned this as an appropriate use before, but noticed there was quite a bit of disagreement with that. But the fact of the matter is, exceptions really are the best way to deal with business rule exceptions.

Alex concludes that, in this case, using exceptions to propagate errors across the tiers is a better solution. He's willing to pay the exception performance tax:

Less code, less mess. Nanoseconds slower? Probably. A big deal in the context of a web request going through three physical tiers? Not at all.

I totally agree. As long as you're aware of the cost, this is a perfectly reasonable thing to do. An iron-clad adherence to the "avoid all exceptions" rule would be a net loss.
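For what it's worth, here's roughly what that pattern looks like-- a hypothetical sketch, with an exception class and business rule I made up, not Alex's actual code:

' A hypothetical exception type for violated business rules.
Public Class BusinessRuleException
    Inherits Exception
    Public Sub New(ByVal message As String)
        MyBase.New(message)
    End Sub
End Class

Module OrderDemo
    ' Middle tier: a violated rule is propagated as an exception
    ' instead of a status code threaded back through every layer.
    Public Sub PlaceOrder(ByVal quantity As Integer)
        If quantity <= 0 Then
            Throw New BusinessRuleException("Order quantity must be greater than zero.")
        End If
        ' ... persist the order ...
    End Sub

    ' Presentation tier: catch the specific exception and show the message.
    Sub Main()
        Try
            PlaceOrder(0)
        Catch ex As BusinessRuleException
            Console.WriteLine("Could not place order: " & ex.Message)
        End Try
    End Sub
End Module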

Download the updated TryParse demo VS.NET 2005 solution (17kb zip)


How to fit three bugs in 512 bytes of security code

In the spirit of iPod modem hacking, Michael Steil documents how hackers compromised the Xbox security system. Mostly thanks to 512 bytes of rather buggy security code embedded in the Xbox boot ROM:

The Xbox is an IBM PC, i.e. it has an x86 CPU. When the machine is turned on, it starts execution 16 bytes from the top of its address space, at the address FFFF_FFF0 (F000:FFF0). On an IBM PC, the upper 64 KB (or more) of the address space are occupied by the BIOS ROM, so the CPU starts execution in this ROM. The Xbox, having an external (reprogrammable) 1 MB Flash ROM chip (models since 2003 have only 256 KB), would normally start running code there as well, since this megabyte is also mapped into the uppermost area of the address space. But this would make it too easy for someone who wants to either replace the ROM image with a self-written one or patch it to break the chain of trust ("modchips"). If the ROM image could be fully accessed, it would be easy to reverse-engineer the code; encryption and obfuscation would only slow down the hacking process a bit.

A common idea to make the code inaccessible is not to put it into an external chip, but integrate it into one of the other chips. Then there is no standard way to extract the data, and none to replace the chip with one with different contents. But this way, it is a lot more expensive, both the design of a chip that includes both ROM and additional logic, and updating the ROM in a new version of the Xbox if there is a flaw in the ROM.

A good compromise is to store only a small amount of code in one of the other chips, and store the bulk of it in the external Flash chip. This small ROM can not be extracted easily, and it cannot be changed or replaced. The code in there just has to make sure that an attacker can neither understand nor successfully patch the bulk of the code he has access to, which is stored in Flash ROM.

Microsoft decided to go this way, and they stored 512 bytes of code in the Xbox' Southbridge, the MCPX (Media and Communications Processor for Xbox), which is manufactured by nVidia. This code is supposed to be mapped into the uppermost 512 bytes of the address space, so that the CPU starts execution there. It includes a decryption function with a secret key that deciphers (parts of the) "unsafe" code in the Flash ROM into RAM and runs it. Without knowing the key, it is practically impossible to understand or even patch the encrypted code in Flash ROM.

I know virtually nothing about cryptography, and even I could have told you that checking a single 32-bit value is a remarkably poor substitute for a real hash.
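To make that concrete, here's a little VB.NET sketch of the difference-- obviously not the actual Xbox code, and I'm using a toy 32-bit checksum to stand in for the single-value check:

Imports System.Security.Cryptography

Module IntegrityCheck
    ' Weak: reduce the whole image to one 32-bit value. Any patched image
    ' that happens to (or is made to) produce the same value passes.
    Function WeakCheck(ByVal image As Byte(), ByVal expected As UInteger) As Boolean
        Dim sum As UInteger = 0
        For Each b As Byte In image
            sum += b
        Next
        Return sum = expected
    End Function

    ' Strong: compare a real cryptographic hash of the image. Forging a
    ' second image with the same hash is computationally infeasible.
    Function StrongCheck(ByVal image As Byte(), ByVal expectedHash As Byte()) As Boolean
        Dim actual As Byte() = SHA256.Create().ComputeHash(image)
        If actual.Length <> expectedHash.Length Then Return False
        For i As Integer = 0 To actual.Length - 1
            If actual(i) <> expectedHash(i) Then Return False
        Next
        Return True
    End Function
End Module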

I'm thinking Microsoft won't be making these kinds of newbie security mistakes with the Xbox 360. Current rumor suggests we won't have to wait long to find out-- the 360 will supposedly go on sale on or near Black Friday.


Consolas and ClearType

You know you've entered the highest pantheons of geekhood when you get excited about Microsoft's new fixed-width font, Consolas. I am always on the lookout for a better fixed-width programming font. After reading Scott's post, and then Steve's post, I was intrigued enough to copy it from a Vista install onto my XP box.

And that's when the disappointment set in. Here's Lucida Console, 9 point. Just to clear your visual palate.

[screenshot: lucida_9_standard.gif]

Consolas, 10 point, standard font smoothing. MY EYES! THE GOGGLES! THEY DO NOTHING!

[screenshot: consolas_10_standard.gif]

Consolas, 10 point, ClearType font smoothing.

[screenshot: consolas_10_cleartype.gif]

I'll definitely agree that Consolas is one of the best looking ClearType fonts I've ever seen. That's probably because it is part of the first font family designed from scratch with ClearType hinting in mind.

However, I prefer not to use font smoothing on my programming fonts. And Consolas looks like crap without ClearType! Consolas appears to lack any kind of hinting for reasonable display at small point sizes. Consolas isn't just optimized for ClearType, it can barely be used without it.

Well, so much for that. Consolas, you are dead to me. Here's hoping someone at Microsoft wises up and adds the normal font hinting so Consolas displays legibly at 9 to 13 points.*

For the record, I am not anti-ClearType. On a high DPI display-- think 15" laptop display with a resolution of 1600x1200-- I definitely like it. But on a display with a more typical DPI, say a typical 19" 1280x1024 panel, the ClearType RGB pixel noise around the fonts is extremely fatiguing to my eyes. Particularly when reading fixed-width programming fonts.

Now, before you write me off as a font hatin' luddite, let me point out that Rick Strahl has almost exactly the same problem with Consolas, ClearType, and programming fonts that I do. It's a great technology, but it's also a high-DPI display technology, and Windows sucks for high DPI displays. That's a huge disconnect. And it won't be resolved until Windows Vista ships.

* If the whole hinting thing doesn't work out between us, it's good to know that Consolas can find some alternative work in Spanish-speaking countries.


Option Strict and Option Explicit in VB.NET 2005

I just noticed that Option Explicit is on by default for new VB solutions in Visual Studio .NET 2005:

[screenshot: vsnet_2005_beta2_option_explicit_default.gif]

It's about damn time.

There's nothing more vicious than making an innocent typo when referencing a variable and not knowing about it because the compiler silently declares a new variable for you:

MyStringName = "Pack my box with five dozen liquor jugs"
Console.WriteLine(MyStringNam)

Just talking about it makes my eye twitch uncontrollably. It's almost as bad as making a language case sensitive.

Option Explicit Off is pure, unmitigated evil, yet Option Explicit Off is the default in VS.NET 2002 and 2003. I've audited a half-dozen VB.NET projects where, months into development, the developers didn't realize that it was off! Laugh all you want, but this is the power of default values.

Paul Vick pointed out that VS.NET 2002 and later do in fact ship with Option Explicit On set by default. What I really needed was an option not to work with knuckleheads who turn it off, because I got bitten with this one a few times.

I'm not sure that Option Strict is quite the no-brainer that Option Explicit is, but Dan Appleman sure has strong feelings about it:

One of the debates that has arisen with the arrival of Visual Basic .NET is the use of Option Strict. Option Strict turns on strong type checking. You've probably heard about "evil type coercion" (pdf) in Visual Basic 6 -- VB6's habit of converting data types automatically based on its best guess of what you want to do. While this can be a convenience to programmers, it can also lead to obscure and unexpected bugs when VB's guess does not correspond to what you intended.

The incorporation of strict type checking into Visual Basic .NET represents one of the most important improvements to the language. Unfortunately, Microsoft showed a stunning lack of confidence in their decision to incorporate it by leaving Option Strict off by default. In other words, when you create a new VB.NET project, strict type checking remains off.

Some argue that this is a good thing. Leaving Option Strict off allows VB.NET to automatically convert data types in the same way as VB6. Not only that, but with strict type checking off, VB.NET can automatically perform late binding on Object variables in the same way as VB6 (where a variable is of type "Object", VB will perform a late bound call on the object, correctly calling the requested method if it exists).

The people who make these arguments are wrong.

You should ALWAYS turn Option Strict On for every application.

He also calls this Option Slow, referring to the slow, expensive IL that must be emitted behind the scenes for this magical type conversion scheme to work-- the source of endless "VB.NET is slower than C#" benchmarks.

I tend to agree that this probably shouldn't be off by default, but it's nowhere near as poisonous as Option Explicit. Option Explicit Off has no legitimate use. Option Strict Off has one clear use case: it's great when you're writing a lot of late binding code. Let the IL deal with all the nasty, verbose type conversions. As Scott points out, we can now use partial classes in VB.NET 2.0 to mark selected sections of code Option Strict Off while leaving the rest Option Strict On. It's the best of both worlds.
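Here's a rough sketch of what that looks like-- the file names and members are hypothetical, but the Option statements per file are the point:

' ReportBuilder.vb -- the bulk of the class, with strict type checking on
Option Strict On

Partial Public Class ReportBuilder
    Public Function BuildHeader(ByVal title As String) As String
        Return "Report: " & title
    End Function
End Class

' ReportBuilder.LateBound.vb -- only the late binding code lives here
Option Strict Off

Partial Public Class ReportBuilder
    ' With Option Strict Off, member access on an Object variable is
    ' resolved at runtime -- no explicit casts or reflection plumbing.
    Public Sub AutomateExcel(ByVal excelApp As Object)
        excelApp.Visible = True
        excelApp.Workbooks.Add()
    End Sub
End Class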

I guess I could be critical of Microsoft for not having the balls to also turn Option Strict on by default, but I consider it a minor miracle that we even got Explicit. I'll take it.


Does Having The Best Programmers Really Matter?

Joel has a lengthy entry in which he asks, does having the "best programmers" really matter?

This is something I've talked about before: extreme skill disparity is unique to the profession of software development. The odds of working with a genius or a jackass on any given job are about fifty-fifty.

What's worse is that there's no correlation between experience and skill, either. I've worked with amazingly talented interns and 20-year vets who produced terrible code. Joel provides additional data to support this hypothesis:

The data I rely upon comes from Professor Stanley Eisenstat at Yale. Each year he teaches a programming-intensive course, CS 323, where a large proportion of the work consists of about ten programming assignments. The assignments are very serious for a college class: implement a Unix command-line shell, implement a ZLW file compressor, etc.

There was so much griping among the students about how much work was required for this class that Professor Eisenstat started asking the students to report back on how much time they spent on each assignment. He has collected this data carefully for several years.

I spent some time crunching the data; it's the only data set I know of where we have dozens of students working on identical assignments using the same technology at the same time. It's pretty darn controlled, as experiments go.

Here's a representative sample for a single CS 323 assignment, showing time spent versus grade/score:

[chart: hours vs. score scatter plot for a Yale CS 323 assignment]

There is absolutely no correlation whatsoever between time spent on the problem and the resulting score for these CS students. It's not much of a stretch to extend this to time spent in the profession of software development.
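If you want to sanity check a claim like that yourself, the standard tool is the Pearson correlation coefficient over the (hours, score) pairs-- here's a minimal sketch of my own, not Joel's actual number crunching:

Module Stats
    ' Pearson correlation: +1 = perfect positive, 0 = none, -1 = perfect negative.
    Function Correlation(ByVal x As Double(), ByVal y As Double()) As Double
        Dim n As Integer = x.Length
        Dim meanX As Double = 0, meanY As Double = 0
        For i As Integer = 0 To n - 1
            meanX += x(i)
            meanY += y(i)
        Next
        meanX /= n
        meanY /= n

        Dim cov As Double = 0, varX As Double = 0, varY As Double = 0
        For i As Integer = 0 To n - 1
            cov += (x(i) - meanX) * (y(i) - meanY)
            varX += (x(i) - meanX) ^ 2
            varY += (y(i) - meanY) ^ 2
        Next
        ' Returns NaN if either series has zero variance.
        Return cov / Math.Sqrt(varX * varY)
    End Function
End Module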

The rest of Joel's post veers dangerously close to being self-serving-- look at my awesome company and the smart employees I hired!-- but at least he's finally acknowledging that he is really talking about a very narrow niche:

It's not just a matter of "10 times more productive." It's that the "average productive" developer never hits the high notes that make great software.

Sadly, this doesn't really apply in non-product software development. Internal, in-house software is rarely important enough to justify hiring rock stars. Nobody hires Dolly Parton to sing at weddings. That's why the most satisfying careers, if you're a software developer, are at actual software companies, not doing IT for some bank.

Since Joel's commercial shrink-wrap software development is a tiny percentage of the overall IT market, does this mean that having the best programmers doesn't matter most of the time?

Sadly, I believe the answer is yes.
