My Buddy, Regex

I generally don't subscribe to the UNIX religion, but there is one area where I am an unabashed convert: regular expressions.

Yeah, the syntax is a little scary, but for processing strings, nothing is more effective. The RegEx is the power drill of the programmer's toolkit: not appropriate for every job, but the go-to tool for a lot of common jobs. And what could be more common than the humble string, particularly in this day and age of HTML, XML, SOAP, and other plain text formats? Most modern development languages have complete Regular Expression support – even in the IDE for things like search and replace.

Over the last four years I've experimented with a number of commercial, freeware, and even homegrown RegEx tools. In the .NET era, I started with Expresso, and I recently found out about Regulator, which is hands down the most impressive free RegEx tool I've encountered to date. But that was before I met my new best friend, RegexBuddy:

[Screenshot of RegexBuddy]

I belatedly realized after I created this screenshot that I may have accidentally picked the complicated "run away screaming" example. Great for me as an intermediate regex user, but not so great for introducing people to the miracle of RegEx. So let me apologize by way of explanation: this RegEx captures all valid HTML 4.0 tags. It also exploits a very powerful feature called named captures: see the ?<element> and ?<attr> highlighted in that tannish-brown? In .NET you can refer to those matches with a very simple, logical syntax:

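' Needs Imports System.Text.RegularExpressions at the top of the file.
' reg (a Regex) and strHTML (a String) are assumed to be defined elsewhere.
Dim mc As MatchCollection = reg.Matches(strHTML)
For Each m As Match In mc
    ' Look up each named capture by the name used in the pattern
    Console.WriteLine(m.Groups("element").Value)
    Console.WriteLine(m.Groups("attr").Value)
Next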
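
For readers outside .NET, here is a rough Python sketch of the same named-capture lookup. Treat it as an illustration only: the pattern is a deliberately simplified opening-tag matcher that I made up for this example, not the HTML 4.0 regex from the screenshot, and `TAG`, `list_tags`, and `sample` are hypothetical names. Note that Python spells named groups `(?P<name>...)` where .NET uses `(?<name>...)`.

```python
import re

# Simplified, illustrative pattern: NOT the HTML 4.0 regex from the screenshot.
# It matches opening tags only and treats everything after the tag name
# as one undifferentiated "attr" string.
TAG = re.compile(r"<(?P<element>[A-Za-z][A-Za-z0-9]*)(?P<attr>[^<>]*)>")

def list_tags(html):
    """Return (element, attr) pairs for each opening tag found in html."""
    return [(m.group("element"), m.group("attr").strip())
            for m in TAG.finditer(html)]

sample = '<p class="intro">Hello <b>world</b></p>'
for element, attr in list_tags(sample):
    # Each named group is read back by name, much like m.Groups("element")
    print(element, repr(attr))
```

The point is the same as in the .NET version: once a group has a name, the code that reads it does not care where the group sits in the pattern.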

The one unique, killer feature that RegexBuddy has is super fast, real-time highlighting of all possible matches as you type the regular expression. That has always been my complaint about RegEx composition: it's difficult to tell beforehand what the effect of your RegEx will be until you "run" it and browse all the matches. With RegexBuddy, you don't have to – just type and watch. No running required. But that's not the only great feature: the RegEx decomposition and pre-built library are also best of breed. Needless to say, highly recommended, and currently my preferred tool. It's not free, but TANSTAAFL.

Once you come to grips with the basics of regular expressions, you'll want a handy cheat sheet of the syntax. The best one I've found is VisiBone's JavaScript foldout. There's also an online version. All the VisiBone stuff is super cool, and brings back warm memories of those incredible Beagle Brothers posters I had for the Apple //.

However, the information density does get a little ridiculous on the VisiBone cards, so I'd go with the foldouts or the wall charts, unless you enjoy squinting a lot. If you just can't get enough, and you want to learn about the thrilling history of RegEx and understand how they work under the hood (try to envision me stifling a yawn at this point) there's also the O'Reilly book.

You may not even need to know the syntax if you can drop prebuilt RegExes into your code. Why build what you can steal? There are a number of sites with growing prebuilt repositories of regular expressions:

Drunk with the power and possibility of regular expressions, you might start thinking regular expressions can do... well, just about anything. I've been there, and let me warn you up front: they can't do recursion – or reverse matching from the rear of a string – without some mighty ugly hacks. This rules out a lot of potential uses, or at least relegates RegExes to a helper role. And that's a good thing. Despite their undeniable power, RegExes aren't a procedural programming language. In limited string processing roles, they're perfect. That's what they were designed to do. But can you imagine writing an entire application with that kind of crazy, nigh-indecipherable syntax? (Perl, anyone?)
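
To make the "helper role" concrete, here is a minimal Python sketch of checking whether parentheses are balanced, a classic job that needs recursion or counting. Python's standard `re` module, for one, has no recursion construct, so the regex here only picks out the parentheses and a few lines of ordinary code do the counting. The names `PAREN` and `is_balanced` are hypothetical, and this is one possible division of labor, not the only one.

```python
import re

# The regex plays a helper role: it only finds the parentheses.
PAREN = re.compile(r"[()]")

def is_balanced(text):
    """Return True if every '(' in text is closed by a later ')'."""
    depth = 0
    for m in PAREN.finditer(text):
        depth += 1 if m.group() == "(" else -1
        if depth < 0:
            # A ')' appeared before any '(' that could match it
            return False
    return depth == 0

print(is_balanced("f(a, g(b))"))  # True
print(is_balanced("f(a, g(b)"))   # False
```

The pattern stays trivially readable, and the part that needs state lives in plain code where it belongs.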
