Coding Horror

programming and human factors

Mastering GUIDs with Occam's Razor

Do you remember the scene from the movie Full Metal Jacket where the marines recite the USMC creed?

[Image: full-metal-jacket.jpg]

It's a little known fact, but programmers have a similar creed:

This is my GUID. There are many like it but this one is mine. My GUID is my best friend. It is my life. I must master it as I must master my life. Without me, my GUID is useless. Without my GUID I am useless.

In fact, GUIDs are so near and dear to our hearts that we recently had a spirited discussion about them at work. Let's say you had a string and needed to determine whether it was a valid GUID. The easy way is a .Parse() style Try-Catch code block:

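Guid g;
try
{
    // The constructor throws FormatException when the string is not a valid GUID.
    g = new Guid("x");
}
catch (FormatException)
{
    // Not a valid GUID.
}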

This is the correct answer.. most of the time. But you know programmers. They never met an edge condition they didn't enjoy discussing ad nauseam. And I was one of the first to chime in:

This is definitely a good way to validate a data type; however, just be aware of the exception performance penalty. Throwing exceptions when parsing fails is expensive, so if this is something that
  • will be invalid often
  • appears in a loop
  • occurs with high frequency

then you'd want to go with a non-exception-based check. However, most of the time none of these things are true, so the performance is irrelevant.
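For the cases in that list, a non-exception-based check might look something like the sketch below. It assumes a runtime that provides Guid.TryParse, which arrived in .NET Framework 4.0 and so was not an option when this discussion took place. The GuidCheck class and its IsGuid method are names made up for illustration.

```csharp
using System;

public static class GuidCheck
{
    // Hypothetical helper: returns true when the input parses as a GUID.
    // Guid.TryParse reports failure through its return value instead of
    // throwing, so invalid input does not pay the exception cost.
    public static bool IsGuid(string candidate)
    {
        Guid parsed;
        return Guid.TryParse(candidate, out parsed);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(GuidCheck.IsGuid(Guid.NewGuid().ToString())); // True
        Console.WriteLine(GuidCheck.IsGuid("x"));                        // False
    }
}
```

The try-catch version remains perfectly reasonable when invalid input is rare; this form only pays off when failures are common or the check sits in a hot loop.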

Then someone suggested trying a regular expression. Oh great, now we have two problems:

Regex r = new Regex(
    "^((?-i:0x)?[A-Fa-f0-9]{32}|" +
    "[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}|" +
    @"\{[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}\})$");

It's valid, but I couldn't resist tweaking this regex for simplicity's sake. The official GUID spec only defines one format for GUID strings, the familiar 8-4-4-4-12 format:

Regex r = new Regex(
    @"^(\{|\()?[A-Fa-f0-9]{8}-([A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}(\}|\))?$");
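To see what the simplified pattern accepts, here is a rough sketch that runs it against a handful of sample strings, with the optional brace and parenthesis delimiters escaped. The GuidPattern class and the sample values are assumptions made up for illustration, not part of the original discussion.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GuidPattern
{
    // Optional opening brace or parenthesis, the 8-4-4-4-12 hex groups,
    // then an optional closing brace or parenthesis.
    public static readonly Regex Simplified = new Regex(
        @"^(\{|\()?[A-Fa-f0-9]{8}-([A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}(\}|\))?$");
}

public static class Program
{
    public static void Main()
    {
        string[] samples =
        {
            "0f8fad5b-d9cb-469f-a165-70867728950e",
            "{0f8fad5b-d9cb-469f-a165-70867728950e}",
            "(0f8fad5b-d9cb-469f-a165-70867728950e)",
            "{0f8fad5b-d9cb-469f-a165-70867728950e)", // mismatched delimiters
            "0f8fad5bd9cb469fa16570867728950e",       // 32 digits, no hyphens
            "x"
        };

        // Print each sample next to whether the pattern matches it.
        foreach (string s in samples)
        {
            Console.WriteLine("{0,-40} {1}", s, GuidPattern.Simplified.IsMatch(s));
        }
    }
}
```

Note that the opening and closing delimiters are matched independently, so the mismatched sample is accepted too. That is one hole a pattern like this leaves open.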

This is my post, so I'll skip the part where others poked holes in my regex. Just when we thought it was over, a fellow developer whipped out a code snippet that benchmarks how long it takes to validate GUIDs via each method:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        Guid g = Guid.NewGuid();
        string s = g.ToString();

        // Time 10,000 validations using the Guid constructor.
        DateTime before = DateTime.Now;
        for (int i = 0; i < 10000; i++)
        {
            bool retVal = IsGuid(s);
        }
        Console.WriteLine(DateTime.Now.Subtract(before));

        // Time 10,000 validations using the regex.
        before = DateTime.Now;
        for (int i = 0; i < 10000; i++)
        {
            bool retVal = IsGuid2(s);
        }
        Console.WriteLine(DateTime.Now.Subtract(before));
        Console.ReadLine();
    }

    // Validates by attempting to construct a Guid and catching any failure.
    public static bool IsGuid(string guidString)
    {
        try
        {
            Guid guid = new Guid(guidString);
            return true;
        }
        catch
        {
            return false;
        }
    }

    // Validates with the regex, building a new Regex on every call.
    public static bool IsGuid2(string guidString)
    {
        Regex r;
        r = new Regex(
            @"^(\{|\()?[A-Fa-f0-9]{8}-([A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}(\}|\))?$");
        Match m = r.Match(guidString);
        if (m.Success)
            return true;
        else
            return false;
    }
}

According to this, constructor validation is 3 to 4 times faster than the regex.. or is it? I immediately noticed a few problems that made this a rather questionable benchmark. And, as before, I couldn't resist investigating:

If I increase the iterations to 100,000:

00.1874856
00.7968138

You typically wouldn't want to create a new regex inside the loop, because it's too expensive. If I move the regex creation outside the loop:

00.2031094
00.5780806

If I set RegexOptions.Compiled on the regex:

00.1874856
00.3437236

If I run the above with CTRL+F5 (sans debugger):

00.1718673
00.1874916
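For anyone who wants to repeat the experiment, the tweaks above could be combined into something like the following sketch. It is not the code we actually ran: it assumes a runtime with System.Diagnostics.Stopwatch and Func<T, TResult>, swaps DateTime.Now for Stopwatch (which is generally better suited to measuring elapsed time), and uses made-up names (GuidBenchmark, IsGuidByConstructor, IsGuidByRegex, Time).

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class GuidBenchmark
{
    // Built once and compiled, rather than constructed on every call.
    private static readonly Regex GuidRegex = new Regex(
        @"^(\{|\()?[A-Fa-f0-9]{8}-([A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}(\}|\))?$",
        RegexOptions.Compiled);

    // Constructor-based check; a malformed string surfaces as FormatException.
    public static bool IsGuidByConstructor(string s)
    {
        try { new Guid(s); return true; }
        catch (FormatException) { return false; }
    }

    // Regex-based check using the shared, precompiled pattern.
    public static bool IsGuidByRegex(string s)
    {
        return GuidRegex.IsMatch(s);
    }

    // Times `iterations` calls of `check` against `input`.
    public static TimeSpan Time(Func<string, bool> check, string input, int iterations)
    {
        check(input); // warm-up call so one-time JIT and regex setup is not timed
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check(input);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        string s = Guid.NewGuid().ToString();
        const int iterations = 100000;
        Console.WriteLine("constructor: {0}", Time(IsGuidByConstructor, s, iterations));
        Console.WriteLine("regex:       {0}", Time(IsGuidByRegex, s, iterations));
    }
}
```

As with the numbers above, run it outside the debugger in a release build, and expect the results to vary with the machine and the .NET version.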

It was definitely a fun discussion. I certainly learned a few things about GUIDs I didn't know. Heck, discussions like this are why I joined a software development company in the first place. But it's also a pointless discussion.

Performance was a complete non-issue in this particular scenario. That's why we should always program with Occam's Razor in mind:

Given two similar code paths, choose the simpler one.

Edge conditions and fancy techniques are interesting, but they're not necessarily a worthwhile use of time. Sometimes the simple and stupid solution is all you need.

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Exchange and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: http://twitter.com/codinghorror