Coding Horror

programming and human factors

Shortening Long File Paths

We're working on a little shell utility that displays paths in a menu. Some of these paths can get rather long, so I cooked up this little regular expression to shorten them. It's a replacement, so you call it like this:

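using System.Text.RegularExpressions;

static string PathShortener(string path)
{
    // $1 = drive ("C:") or the first backslash of a UNC path,
    // $2 = the first two folders, $3 = the last folder plus the file name.
    const string pattern = @"^(\w+:|\\)(\\[^\\]+\\[^\\]+\\).*(\\[^\\]+\\[^\\]+)$";
    const string replacement = "$1$2...$3";

    if (Regex.IsMatch(path, pattern))
    {
        return Regex.Replace(path, pattern, replacement);
    }
    else
    {
        return path;
    }
}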

So, for these paths:

C:\Documents and Settings\jatwood\My Documents\Visual Studio 2005\SimpleEncryption\UnitTests\UnitTests.vb
\\wumpus\public\Hilo Deliverables\Hilo Final\Introduction\Code\Intro\App_Themes\cellphone\photo-small.jpg

The result is:

C:\Documents and Settings\jatwood\...\UnitTests\UnitTests.vb
\\wumpus\public\...\cellphone\photo-small.jpg

The general strategy is to keep the first two folders at the beginning, replace the middle with an ellipsis, and leave the final folder and filename on the end.
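
To make the three capture groups concrete, here's a minimal, self-contained sketch of that strategy. It assumes the pattern is written with its backslashes escaped inside a verbatim string; the class name PathShortenerDemo and the method name Shorten are made up for illustration.

```csharp
using System.Text.RegularExpressions;

static class PathShortenerDemo
{
    // Group 1: a drive such as "C:", or the first backslash of a UNC path.
    // Group 2: the first two folders, including the trailing backslash.
    // Group 3: the final folder and the file name.
    const string Pattern = @"^(\w+:|\\)(\\[^\\]+\\[^\\]+\\).*(\\[^\\]+\\[^\\]+)$";

    public static string Shorten(string path)
    {
        // Regex.Replace hands back the input untouched when nothing matches,
        // so paths too short to have a "middle" pass straight through.
        return Regex.Replace(path, Pattern, "$1$2...$3");
    }
}
```

Calling PathShortenerDemo.Shorten(@"C:\temp\notes.txt") returns the path unchanged, because there aren't enough folders for all three groups to match. A path needs at least two leading folders, something in the middle, and a final folder plus file name before the ellipsis appears.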

After spending an hour dinking around with this and testing it on a bunch of paths, a colleague pointed me to the Windows API call PathCompactPathEx, which (almost) does the same thing. Doh!

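using System.Runtime.InteropServices;
using System.Text;

[DllImport("shlwapi.dll", CharSet = CharSet.Auto)]
static extern bool PathCompactPathEx([Out] StringBuilder pszOut, string szPath, int cchMax, int dwFlags);

static string PathShortener(string path, int length)
{
    // Size the buffer for cchMax characters; the default capacity (16) is too small for longer results.
    StringBuilder sb = new StringBuilder(length);
    // Fall back to the original path if the call fails.
    if (!PathCompactPathEx(sb, path, length, 0))
    {
        return path;
    }
    return sb.ToString();
}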

As you can see from the API definition for PathCompactPathEx, this works a little differently. It lets you set an absolute length for the path, and displays as many characters as it can with a "best fit" placement of the ellipsis. Here's the output for our two paths:

C:\Documents and Settings\jatwood...\UnitTests.vb
\\wumpus\public\Hilo Deliverab...\photo-small.jpg

So, which to choose? PathCompactPathEx guarantees that the result never exceeds the length you ask for (cchMax counts the terminating null, so that's at most cchMax - 1 visible characters) while displaying as much as it can, but it may not split cleanly on a folder boundary. My regex always splits cleanly, but makes no guarantees on length.

And if you're not running Windows, or if you don't care for P/Invoke, the API call is out.
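
If you want the fixed-length behavior without the API, something like the following managed sketch gets close. It is an approximation of the idea (keep the file name, spend whatever budget is left on the start of the path), not the algorithm shlwapi actually uses. The names PortablePathCompactor and Compact are hypothetical, and maxChars here counts visible characters only, with no terminating null.

```csharp
using System;

static class PortablePathCompactor
{
    public static string Compact(string path, int maxChars)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        if (maxChars < 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
        if (path.Length <= maxChars) return path; // already fits

        const string ellipsis = "...";
        int lastSep = path.LastIndexOf('\\');
        // The tail is the last backslash plus the file name.
        string tail = lastSep >= 0 ? path.Substring(lastSep) : path;
        int headLength = maxChars - ellipsis.Length - tail.Length;

        if (lastSep < 0 || headLength < 0)
        {
            // No room for head + ellipsis + file name: keep the end of the path.
            if (maxChars <= ellipsis.Length) return ellipsis.Substring(0, maxChars);
            return ellipsis + path.Substring(path.Length - (maxChars - ellipsis.Length));
        }

        // As much of the start as fits, then the ellipsis, then the file name.
        return path.Substring(0, headLength) + ellipsis + tail;
    }
}
```

Like the API, this cuts wherever the budget runs out rather than on a folder boundary, so it trades the regex's clean split for a hard upper bound on length.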


Open Source: Free as in "Free"

Here's Scott Hanselman on the death of NDoc:

We are blessed. This Open Source stuff is free. But it's free like a puppy. It takes years of care and feeding. You don't get to criticize a free puppy that you bring in to your home.

Free like a puppy is certainly more poignant than free as in beer. But it's an equally terrible metaphor. Nobody has to crate train NUnit. Nobody has to take NCover for regular walks. Nobody has to clean up NDoc's poop. If open source software required as much effort as raising a puppy, your local pound would be even more full than it already is of unwanted dogs.

[Image: a cute widdle puppy]

The whole point of open source – the reason any open source project exists – is to save us time. To keep us from rewriting the same software over and over. Puppies are cute and fuzzy and sweet, but they're also giant timesinks. To imply that an open source project is as labor intensive as a puppy is reinforcing the very worst stereotypes of open source: software that's only free if your time is worthless.

Open source software is at its best when you aren't obligated to do anything at all.

You definitely shouldn't have to pay for it. Scott didn't:

For "base of the pyramid" fundamental stuff like Build, Test, Coverage, Docs, will we pay for them? We should. Should we have given the NDoc project $5? Did NDoc help me personally and my company? Totally. Did I donate? No, and that was a mistake.

How is that a mistake? It's exactly what open source is about: maximum benefit, minimum effort. To suggest that we are morally obligated to make monetary contributions to every open source project we benefit from shows a profound misunderstanding of the economics of open source.

Personally, as an Open Source project co-leader, I'd much rather folks who use DasBlog pick a bug and send me a patch (unified diff format) than give money. I suspect that Kevin would have been happy with a dozen engineers taking on tasks and taking on bugs in their spare time.

Contributing code to an open source project is a far greater extravagance than any monetary contribution could ever be. It's also infeasible for 99 percent of the audience. Only a rare few have both the time and the ability, which makes it an even more extravagant demand.

If contributing money is foolish and contributing code is an extravagance, what's a poor user to do? Nothing. Nothing at all, that is, other than use the software.

The highest compliment you can pay any piece of open source software is to simply use it, because it's worth using. The more you use it, the more that open source project becomes a part of the fabric of your life, your organization, and ultimately the world.

Isn't that the greatest contribution of all?


Linus Torvalds, Visual Basic Fan

Stiff recently asked a few programmers a series of open-ended questions:

  • How did you learn programming? Were schools of any use?
  • What's the most important skill every programmer should have?
  • Are math and physics important skills for a programmer?
  • What will be the next big thing in computer programming?
  • If you had three months to learn one relatively new technology, which one would you choose?
  • What are your favorite tools and why?
  • What's your favorite programming book?
  • What's your favorite non-programming book?
  • What music do you listen to?

The participants are all quite notable:

  • Linus Torvalds (Linux)
  • Dave Thomas (Pragmatic Programmer)
  • David Heinemeier Hansson (Ruby/Rails)
  • Steve Yegge (Google/Amazon)
  • Peter Norvig (Google Research Director)
  • Guido van Rossum (Python)
  • James Gosling (Java)
  • Tim Bray (XML)

The interesting thing about open-ended questions is that the answers often reveal more about the person answering than they do about the question. Guido van Rossum, for example, comes across as kind of a jerk. But the questions generally provoked some very thoughtful responses.

The most surprising response, however, was from Linus Torvalds. When asked what the "next big thing" would be in computer programming, here's part of his reply:

For example, I personally believe that Visual Basic did more for programming than Object-Oriented Languages did. Yet people laugh at VB and say it's a bad language, and they've been talking about OO languages for decades.

And no, Visual Basic wasn't a great language, but I think the easy database interfaces in VB were fundamentally more important than object orientation is, for example.

Evidently we have another inductee into the he-man object hater's club.

Maybe the moral of this story is that we should value practical aspects of a language far more heavily than relatively meaningless technical merits. Or maybe I just get a kick out of hearing Linus Torvalds, the king of hard-core C geeks, compliment Visual Basic.


Are You an XML Bozo?

Here's a helpful article that documents some common pitfalls to avoid when composing XML documents. Nobody wants to be called an XML Bozo by Tim Bray, the co-editor of the XML specification, right?

[Image: Bozo the clown]

There seem to be developers who think that well-formedness is awfully hard -- if not impossible -- to get right when producing XML programmatically and developers who can get it right and wonder why the others are so incompetent. I assume no one wants to appear incompetent or to be called names. Therefore, I hope the following list of dos and don'ts helps developers to move from the first group to the latter.

  1. Don't think of XML as a text format
  2. Don't use text-based templates
  3. Don't print
  4. Use an isolated serializer
  5. Use a tree or a stack (or an XML parser)
  6. Don't try to manage namespace declarations manually
  7. Use unescaped Unicode strings in memory
  8. Use UTF-8 (or UTF-16) for output
  9. Use NFC
  10. Don't expect software to look inside comments
  11. Don't rely on external entities on the Web
  12. Don't bother with CDATA sections
  13. Don't bother with escaping non-ASCII
  14. Avoid adding pretty-printing white space in character data
  15. Don't use text/xml
  16. Use XML 1.0
  17. Test with astral characters
  18. Test with forbidden control characters
  19. Test with broken UTF-*
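
Items 2 through 4 boil down to "let a serializer do the escaping." Here's a minimal sketch of the difference in C#, using the framework's XmlWriter. The element name entry, the attribute name label, and the class MenuXmlSketch are invented for illustration; they don't come from the article.

```csharp
using System.IO;
using System.Xml;

static class MenuXmlSketch
{
    // Serializer approach: the writer escapes attribute and text content.
    public static string WriteEntry(string label, string path)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var text = new StringWriter())
        {
            using (XmlWriter writer = XmlWriter.Create(text, settings))
            {
                writer.WriteStartElement("entry");
                writer.WriteAttributeString("label", label);
                writer.WriteString(path);
                writer.WriteEndElement();
            }
            // The writer is disposed (and flushed) before we read the text.
            return text.ToString();
        }
    }

    // Text-template approach the list warns against:
    // it breaks as soon as a value contains '<', '&', or '"'.
    public static string NaiveEntry(string label, string path)
    {
        return "<entry label=\"" + label + "\">" + path + "</entry>";
    }
}
```

Feed both methods a label like R&D "files" and a path like a < b: the first produces a document that parses back to the same values, and the second produces something no XML parser will accept.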

I'm a little ambivalent about XML, largely due to what John Lam calls "The Angle Bracket Tax". I think XSLT is utterly insane for anything except the most trivial of tasks, but I do like XPath-- it's sort of like SQL with automatic, joinless parent-child relationships.
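
Here's a small sketch of what I mean by joinless parent-child relationships, using XmlDocument.SelectNodes. The menu document and the names XPathSketch, SampleXml, and SelectValues are hypothetical, made up purely to show the shape of a query.

```csharp
using System.Collections.Generic;
using System.Xml;

static class XPathSketch
{
    // Hypothetical document: folders contain files, no foreign keys needed.
    public const string SampleXml =
        "<menu>" +
        "<folder name='public'><file name='photo-small.jpg'/><file name='intro.txt'/></folder>" +
        "<folder name='private'><file name='notes.txt'/></folder>" +
        "</menu>";

    // Returns the string value of every node the XPath expression selects.
    public static List<string> SelectValues(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var values = new List<string>();
        foreach (XmlNode node in doc.SelectNodes(xpath))
        {
            values.Add(node.Value);
        }
        return values;
    }
}
```

The expression /menu/folder[@name='public']/file/@name reads like a WHERE clause plus a join: it selects the names of the files under the public folder, and the parent-child hop is just a slash.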

But XML is generally the least of all available evils, and if you're going to use it, you might as well follow the rules.


Windows XP, Our New Favorite Legacy Operating System

John Gruber gloats that Windows XP does not fare well in a comparison against OS X:

But everything about Boot Camp is calibrated to position Windows-on-Mac as the next Classic-style ghetto -- a compatibility layer that you might need but that you wish you didn't.

Even the Boot Camp logo:

[Image: Apple Boot Camp Windows logo]

reinforces this. It's a bastardized variant of Microsoft's Windows logo, sans color, and with the whitespace between the four panels forming a hidden "X", à la the hidden arrow in the FedEx logo.

[Microsoft is] stuck with the fact that in a fair shoot-out, Mac OS X is better. It looks better, it's better designed, it's more exciting, more intriguing, more satisfying. Cf. this joke from an anonymous poster in the comments at Mini-Microsoft's weblog:

What's the difference between OS X and Vista?

Microsoft employees are excited about OS X…

What's conspicuously missing from this comparison is any mention of the fact that Windows XP was originally released in October 2001.

In the intervening five years, Apple's OS X has seen five major releases. If you squint your eyes, tilt your head, and look at it from a distance, perhaps you could consider Service Pack 2 a point release. But any way you slice it, Windows XP is going on five years old now. That's ancient. It's also the longest time Microsoft has ever gone between major releases of Windows.

Consider the minimum system requirements for Windows XP:

  • 233 MHz processor
  • 64 MB of RAM (128 MB recommended)
  • Super VGA (800 x 600) display
  • CD-ROM or DVD drive
  • Keyboard and mouse

A Windows XP license is, quite literally, more expensive than a PC that meets these minimum specs today.

What Gruber doesn't realize is that relegating Windows XP to "Classic" status isn't an insult. It's simply acknowledging what every Windows user already knows: Windows XP is a legacy operating system.

[Image: Microsoft Windows XP logo, in greyscale]

And there's no shame in it.

Look at the age of UNIX, which OS X is based on. In the same way that OS X is a modern remodelling of its BSD and Mach kernel origins, Windows Vista will be a much-needed modern renovation of the XP core.

But in the meantime, as the guys at Engadget recently said:

At this point we don't really know what to expect anymore, and since our current XP-powered setup already does everything we need it to, we're getting pretty close to not caring if Vista is ever released at all.

I'm perfectly content to use Windows XP "classic", as long as Windows Vista is on the horizon for 2007.

And there are other benefits to Windows XP's advanced age, too.

Since XP's minimum system requirements are absurdly low by today's standards, you'll have no problem running Windows XP-- even multiple instances of Windows XP-- in a virtual machine on a modern development PC. My optimized, fully-patched Windows XP SP2 Virtual Machine image is down to 587 megabytes. That's a mere 139 megabytes as a self-extracting RAR file.

Most apps run fine in Windows XP with 128 megabytes or 160 megabytes of memory. For example, here's a screenshot of IE 7, Beta 3. It's running in an optimized Windows XP virtual machine with only 128 megabytes of memory:

[Screenshot: Windows XP SP2, in a 128 MB VM, running IE7 Beta 3]

That's with four tabs open to ESPN, eBay, Yahoo news, and MSN. Even with all that going on, I have more than 20 megabytes of free memory. And my commit charge total is well under the physical memory total. There's still room for more stuff!

Clearly, if all you need to do is test IE7 beta 3 in a virtual machine, a humble developer machine with 512 megs of memory will work fine.* Of course, you still need to be careful if you don't have a gigabyte or more of system memory. There are more detailed guidelines at the Virtual PC guy blog.

Here's the complete Task Manager process list for this VM, if you're curious.

[Screenshot: the Task Manager process list for the IE7 virtual machine]

I see a few services that could be disabled to free up even more memory.

* However, if you're working at a job where developers are expected to work on machines with less than 1 gigabyte of memory, it's definitely time to start looking for a new job.
