To Compile or Not To Compile

I am currently in the middle of a way-overdue refactoring of MhtBuilder, which uses regular expressions extensively. I noticed that I had sort of mindlessly added RegexOptions.Compiled all over the place. It says "compiled" so it must be faster, right? Well, like so many other things, that depends:

In [the case of RegexOptions.Compiled], we first do the work to parse into opcodes. Then we also do more work to turn those opcodes into actual IL using Reflection.Emit. As you can imagine, this mode trades increased startup time for quicker runtime: in practice, compilation takes about an order of magnitude longer to startup, but yields 30% better runtime performance. There are even more costs for compilation that should mentioned, however. Emitting IL with Reflection.Emit loads a lot of code and uses a lot of memory, and that's not memory that you'll ever get back. In addition. in v1.0 and v1.1, we couldn't ever free the IL we generated, meaning you leaked memory by using this mode. We've fixed that problem in Whidbey. But the bottom line is that you should only use this mode for a finite set of expressions which you know will be used repeatedly.

In other words, this is something you don't want to do casually, as I was. And 30% faster isn't a very compelling performance gain to balance against those serious tradeoffs. Unless you're in a giant loop, or processing humongous strings, it's almost never worth it. The MSDN documentation also has this interesting tidbit:

To improve performance, the regular expression engine caches all regular expressions in memory. This avoids the need to reparse an expression into high-level byte code each time it is used.

The second time you build your non-compiled regex, no additional interpreting overhead is incurred. And you get that for free. Even though it sounds faster and all, you probably don't want to use RegexOptions.Compiled. But what about Regex.CompileToAssembly?

This avoid the pitfalls associated with dynamic compilation by turning your regular expressions into a compiled DLL. There aren't many articles describing how to do this, but Kent Tegels dug up a few Regex articles with sample code showing how to take advantage of Regex.CompileToAssembly:

It seems ideal-- all the advantages of compilation with none of the disadvantages-- but it adds one disadvantage of its own: your regular expressions are now written in stone. You can't change them at runtime, and you have to know what you're going to do entirely up front. This might be a worthwhile tradeoff at the end of a large project that uses regular expressions extensively, but still.. only 30% faster? I'd want some actual benchmark numbers from my application before I could justify the loss of flexibility and the additional file dependency.

Read more

Stay Gold, America

We are at an unprecedented point in American history, and I'm concerned we may lose sight of the American Dream.

By Jeff Atwood · · Comments

The Great Filter Comes For Us All

With a 13 billion year head start on evolution, why haven't any other forms of life in the universe contacted us by now? (Arrival is a fantastic movie. Watch it, but don't stop there - read the Story of Your Life novella it was based on

By Jeff Atwood · · Comments

I Fight For The Users

If you haven't been able to keep up with my blistering pace of one blog post per year, I don't blame you. There's a lot going on right now. It's a busy time. But let's pause and take a moment

By Jeff Atwood · · Comments

The 2030 Self-Driving Car Bet

It's my honor to announce that John Carmack and I have initiated a friendly bet of $10,000* to the 501(c)(3) charity of the winner’s choice: By January 1st, 2030, completely autonomous self-driving cars meeting SAE J3016 level 5 will be commercially available for passenger

By Jeff Atwood · · Comments