Coding Horror

programming and human factors

Revisiting the XML Angle Bracket Tax

Occasionally I'll write about things that I find sort of mildly, vaguely thought provoking, and somehow that writing turns out to be ragingly controversial once posted here. Case in point, XML: The Angle Bracket Tax. I'm still encountering people online who almost literally hate my guts because I wrote that post. You'd think I kicked their dog, or made inappropriate romantic overtures toward their significant other.

Well, first of all, we are talking about XML the markup language, not XML the religion, right?

I hope so. I try not to get emotionally involved with the tools and technologies that I use, if I can avoid it. This doesn't mean I can't be enthusiastic about or critical of those tools and technologies, but I'm not married to the stuff either way. Who needs all the emotional baggage?

Obviously I failed to communicate this before. I talked about this a little bit on Stack Overflow podcast #5 with Joel, where I tried to amplify and explain my position a little better.

I wasn't trying to present it as "Oh, XML is bad, let's all switch to this new markup language that all the cool guys are using". What I was trying to say is why don't we think about what we're doing? That's the general theme of a lot of the stuff in my blog. Can we just stop programming for a minute to think about what we're doing and not make a blind choice based on "Well this is what my tool does, so that's what I have to do"?

I think obviously there's pros and cons to each. I'm not saying that one is the right solution all the time. But I think, ironically, that is what is happening with XML. I think people are saying "It's always the right answer, because it can store anything, right? And all the stuff I use uses it, so it must be the right choice for everything." That bothers me a little. Maybe I'm just contrarian. Maybe I'm an iconoclast and I want to try different things and see different things, but I think actually understanding the alternatives helps you understand XML better, a little bit, too.

And I hope people reading my blog would not get the idea that it's about a knee-jerk reaction one way or the other. It's about understanding the tradeoffs and applying those tradeoffs to your particular situation. I think that is the absolute art of programming. It's understanding what you could do, and which one of those things fits your situation best. Versus what so many programmers do, which is "I've learned to use a hammer, and I'm gonna hammer everything." Ultimately, to me, it's about self-awareness.

By the way, I'd like to thank everyone who pitches in to make those Stack Overflow podcast transcriptions possible. It is because of your generously donated time that I am able to quote that audio here.

I don't post stuff to push people's buttons, I post it because I want programmers to think about their tools, their technologies, their methods.

Think IBM placards, taken at the Computer History Museum

If what I post here seems unnecessarily confrontational sometimes, a far smarter person than myself said it better than I can:

I blog to help others and also to learn. As it turns out both are aided by getting folks to actually read the stuff. Please pardon the necessary devices.

Please do pardon the necessary devices; I find that I often learn best through the smackdown learning model. That works for me. Maybe it doesn't work for you, and that's OK. There are millions of websites to choose from.

That said, I do actually have a problem with XML, or I wouldn't have written anything in the first place. I think there's a real issue here that is, for the most part, being completely ignored. XML fever may not be as debilitating as, say, Dengue fever, but it has side effects as well.

Consider Norman Walsh's Defending the Tax. Norman is an XML Standards Architect at Sun.

On the other hand, the difference between:

fruit=pear
vegetable=carrot
topping=wax

and

<doc>
<fruit>pear</fruit>
<vegetable>carrot</vegetable>
<topping>wax</topping>
</doc>

isn't really that large, is it? (Or maybe you think it is, de gustibus non est disputandum.)

The de gustibus dismissal means Norman considers it a matter of taste, but it isn't. The difference is large. There is a very real mental cost to parsing even a few short lines of XML.
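To put something concrete next to that comparison, here is a minimal sketch in Java (the language choice is mine, not Norman's) that reads both of his examples into the same map. The class and method names are made up for illustration; `java.util.Properties` and the DOM parser are standard JDK pieces. Treat it as one possible way to read the two forms, not as the canonical one.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
final class TwoFormats {

    // Reads "key=value" lines with java.util.Properties.
    static Map<String, String> fromKeyValue(String text) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(text));
            Map<String, String> result = new TreeMap<>();
            for (String name : props.stringPropertyNames()) {
                result.put(name, props.getProperty(name));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not valid key=value text", e);
        }
    }

    // Reads each child element of the root as a key/value pair.
    static Map<String, String> fromXml(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(text)));
            Map<String, String> result = new TreeMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes that sit between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    result.put(child.getNodeName(), child.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String keyValue = "fruit=pear\nvegetable=carrot\ntopping=wax\n";
        String xml = "<doc>\n<fruit>pear</fruit>\n<vegetable>carrot</vegetable>\n"
                + "<topping>wax</topping>\n</doc>";
        // Both inputs describe the same three pairs, so this prints "true".
        System.out.println(fromKeyValue(keyValue).equals(fromXml(xml)));
    }
}
```

Both inputs carry the same three pairs. The difference lies in how much surrounding markup has to be skipped over to get at them, by the parser and by the person reading the file.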

As a programmer in the Visual Studio ecosystem, I find XML pervasive, in every nook and cranny of a project. Every time I look at my web.config XML file, I pay the mental cost of parsing all the tags in it. Here's this tag, which lines up with this tag. Here's this giant, verbose thing where only half of it actually matters.

Sure, it's a small effort. Insignificant, even. But what's the mental cost of that insignificant effort times the number of developers in the world, times the number of projects in the world?

I also posit that these minor headaches may be more significant than you realize. In Stumbling on Happiness, author Dan Gilbert makes a similar assertion.


His research found that people are bad at predicting their own future happiness. They tend to radically overestimate the positive or negative impact of large events in their lives – losing your job, getting rich, getting divorced, having children. That's generally good; it means we have defense mechanisms in place to adapt and survive in our changing circumstances as human beings. But, we also tend to radically underestimate the impact of the dozens of small events in our lives throughout the day. Thus, small injustices don't trigger our defenses. The effect of that squeaky screen door, the neighbor's barking dog, the interrupting telephone call – all of these may have far more profound cumulative impact on your day to day happiness than you realize.

It's a fascinating book, and I'm only paraphrasing the smallest part of it. I highly recommend reading it if this is at all interesting to you. It won't exactly unlock the secrets to happiness, I'm afraid, but you may gain a deeper understanding of why we tend to make the choices we do in our neverending pursuit of happiness.

I'm not trying to change the world overnight, but I wouldn't mind planting a few seeds of dissent in people's minds. This small stuff matters.

The next time you're trying to figure out an XML file, just think about it.

That's all I'm saying.


The Ultimate Code Kata

As I was paging through Steve Yegge's voluminous body of work recently, I was struck by a 2005 entry on practicing programming:

Contrary to what you might believe, merely doing your job every day doesn't qualify as real practice. Going to meetings isn't practicing your people skills, and replying to mail isn't practicing your typing. You have to set aside some time once in a while and do focused practice in order to get better at something.

I know a lot of great engineers -- that's one of the best perks of working at Amazon -- and if you watch them closely, you'll see that they practice constantly. As good as they are, they still practice. They have all sorts of ways of doing it, and this essay will cover a few of them.

The great engineers I know are as good as they are because they practice all the time. People in great physical shape only get that way by working out regularly, and they need to keep it up, or they get out of shape. The same goes for programming and engineering.

It's an important distinction. I may drive to work every day, but I'm far from a professional driver. Similarly, programming every day may not be enough to make you a professional programmer. So what can turn someone into a professional driver or programmer? What do you do to practice?

The answer lies in the Scientific American article The Expert Mind:

Ericsson argues that what matters is not experience per se but "effortful study," which entails continually tackling challenges that lie just beyond one's competence. That is why it is possible for enthusiasts to spend tens of thousands of hours playing chess or golf or a musical instrument without ever advancing beyond the amateur level and why a properly trained student can overtake them in a relatively short time. It is interesting to note that time spent playing chess, even in tournaments, appears to contribute less than such study to a player's progress; the main training value of such games is to point up weaknesses for future study.

Effortful study means constantly tackling problems at the very edge of your ability. Stuff you may have a high probability of failing at. Unless you're failing some of the time, you're probably not growing professionally. You have to seek out those challenges and push yourself beyond your comfort limit.

Those challenges can sometimes be found on the job, but they don't have to be. Separating the practicing from the profession is often referred to as code kata.


The concept of kata, a series of choreographed practice movements, is borrowed from the martial arts.

If you're looking for some examples of code kata -- ways to practice effortful study and hone your programming skills -- Steve's article has some excellent starting points. He calls them practice drills:

  1. Write your resume. List all your relevant skills, then note the ones that will still be needed in 100 years. Give yourself a 1-10 rating in each skill.

  2. Make a list of programmers who you admire. Try to include some you work with, since you'll be borrowing them for some drills. Make one or two notes about things they seem to do well -- things you wish you were better at.

  3. Go to Wikipedia's entry for computer science, scroll down to the "Prominent pioneers in computer science" section, pick a person from the list, and read about them. Follow any links from there that you think look interesting.

  4. Read through someone else's code for 20 minutes. For this drill, alternate between reading great code and reading bad code; they're both instructive. If you're not sure of the difference, ask a programmer you respect to show you examples of each. Show the code you read to someone else, and see what they think of it.

  5. Make a list of your 10 favorite programming tools: the ones you feel you use the most, the ones you almost couldn't live without. Spend an hour reading the docs for one of the tools in your list, chosen at random. In that hour, try to learn some new feature of the tool that you weren't aware of, or figure out some new way to use the tool.

  6. Pick something you're good at that has nothing to do with programming. Think about how the professionals or great masters of that discipline do their practice. What can you learn from them that you can apply to programming?

  7. Get a pile of resumes and a group of reviewers together in a room for an hour. Make sure each resume is looked at by at least 3 reviewers, who write their initials and a score (1-3). Discuss any resumes that had a wide discrepancy in scoring.

  8. Listen in on a technical phone screen. Write up your feedback afterwards, cast your vote, and then talk about the screen with the screener to see if you both reached the same conclusions.

  9. Conduct a technical interview with a candidate who's an expert in some field you don't know much about. Ask them to explain it to you from the ground up, assuming no prior knowledge of that field. Try hard to follow what they're saying, and ask questions as necessary.

  10. Get yourself invited to someone else's technical interview. Listen and learn. Try to solve the interview questions in your head while the candidate works on them.

  11. Find a buddy for trading practice questions. Ask each other programming questions, alternating weeks. Spend 10 or 15 minutes working on the problem, and 10 or 15 minutes discussing it (finished or not.)

  12. When you hear any interview coding question that you haven't solved yourself, go back to your desk and mail the question to yourself as a reminder. Solve it sometime that week, using your favorite programming language.

What I like about Steve's list is that it's somewhat holistic. When some developers think "practice" they can't get beyond code puzzles. But to me, programming is more about people than code, so there's a limit to how much you can grow from solving every obscure coding interview problem on the planet.

I also like Peter Norvig's general recommendations for effortful study outlined in Teach Yourself Programming in Ten Years.

  1. Talk to other programmers. Read other programs. This is more important than any book or training course.

  2. Program! The best kind of learning is learning by doing.

  3. Take programming classes at the college or graduate level.

  4. Seek out and work on projects with teams of programmers. Find out what it means to be the best programmer on a project -- and the worst.

  5. Work on projects after other programmers. Learn how to maintain code you didn't write. Learn how to write code so other people can effectively maintain it.

  6. Learn different programming languages. Pick languages that have alternate worldviews and programming models unlike what you're used to.

  7. Understand how the hardware affects what you do. Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), transfer data over ethernet (or the internet), read consecutive words from disk, and seek to a new location on disk.

You can also glean some further inspiration from Pragmatic Dave's 21 Code Katas, or maybe you'd like to join a Coding Dojo in your area.

I don't have a long list of effortful study advice like Steve and Peter and Dave do. I'm far too impatient for that. In fact, there are only two movements in my book of code kata:

  1. Write a blog. I started this blog in early 2004 as a form of effortful study. From those humble beginnings it has turned into the most significant thing I've ever done in my professional life. So you should write blogs, too. The people who can write and communicate effectively are, all too often, the only people who get heard. They get to set the terms of the debate.

  2. Actively participate in a notable open source project or three. All the fancy blah blah blah talk is great, but are you a talker or a doer? This is critically important, because you will be judged by your actions, not your words. Try to leave a trail of public, concrete, useful things in your wake that you can point to and say: I helped build that.

When you can write brilliant code and brilliant prose explaining that code to the world -- well, I figure that's the ultimate code kata.


Department of Declaration Redundancy Department

I sometimes (often, actually) regress a few years mentally and forget to take advantage of new features afforded by the tools I'm using. In this case, we're using the latest and greatest version of C#, which offers implicitly typed local variables. While working on Stack Overflow, I was absolutely thrilled to be able to refactor this code:

StringBuilder sb = new StringBuilder(256);
UTF8Encoding e = new UTF8Encoding();
MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();

Into this:

var sb = new StringBuilder(256);
var e = new UTF8Encoding();
var md5 = new MD5CryptoServiceProvider();

It's not dynamic typing, per se; C# is still very much a statically typed language. It's more of a compiler trick, a baby step toward a world of Static Typing Where Possible, and Dynamic Typing When Needed.
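Java later gained a comparable `var` keyword (in Java 10), so the "still statically typed" point can be sketched there as well. This is an illustration in Java under that assumption, not the Stack Overflow C# code, and the class and method names are hypothetical. Overloads are picked at compile time, so they reveal the type the compiler inferred for each variable.

```java
// Hypothetical demo class; requires Java 10 or later for `var`.
final class InferredTypes {

    // Overloads are chosen at compile time from the static type of the argument.
    static String kind(Object value)        { return "Object"; }
    static String kind(StringBuilder value) { return "StringBuilder"; }
    static String kind(int value)           { return "int"; }
    static String kind(double value)        { return "double"; }

    static String demo() {
        var sb = new StringBuilder(256);        // inferred as StringBuilder
        Object boxed = new StringBuilder(256);  // explicitly declared as Object
        var maxEntries = 5;                     // inferred as int
        var pi = 3.14159;                       // inferred as double

        // sb = "some text";  // would not compile: sb is a StringBuilder, not a String

        return kind(sb) + " " + kind(boxed) + " " + kind(maxEntries) + " " + kind(pi);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "StringBuilder Object int double"
    }
}
```

The variable declared with `var` resolves to the `StringBuilder` overload, while the one declared as `Object` does not, even though both hold the same kind of object at run time. The inference happens entirely in the compiler.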

This may be a cheap parlor compiler trick, but it's a welcome one. While writing C# code, I sometimes felt like I had entered the Department of Redundancy Department.


Sure, there are times when failing to explicitly declare the type of an object can hurt the readability and maintainability of your code. But having the option to implicitly declare type can be a huge quality of life improvement for everyday coding, too.

There's always a tradeoff between verbosity and conciseness, but I have an awfully hard time defending the unnecessarily verbose way objects were typically declared in C# and Java.

BufferedReader br = new BufferedReader (new FileReader(name));

Who came up with this stuff?

Is there really any doubt what type the variable br is? Does it help anyone, ever, to require another BufferedReader on the front of that line? This has bothered me for years, but it was an itch I just couldn't scratch. Until now.

If that makes sense to you, why not infer more fundamental data types, too?

var url = "http://tinyurl.com/5pfvvy";
var maxentries = 5;
var pi = 3.14159;
var n = new int[] {1, 2, 3};

I use implicit variable typing whenever and wherever it makes my code more concise. Anything that removes redundancy from our code should be aggressively pursued -- up to and including switching languages.

You might even say implicit variable typing is a gateway drug to more dynamically typed languages. And that's a good thing.


Coding For Violent Psychopaths

Today's rumination is not for the faint of heart. It's from the venerable C2 Wiki page Code For The Maintainer:

Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.


Perhaps a little over the top, but maybe that shock to the system is what we need to get this important point across to our fellow developers.

If scare tactics don't work, hopefully you can develop a grudging respect for the noble art of maintenance programming over time. It may not be glamorous, but it's 99% of the coding work in this world.


Physics Based Games

I've always been fascinated by physics-based gameplay. Even going back to the primeval days of classic arcade gaming, I found vector-based games, with their vastly simplified 2D approximations of physics and motion, more compelling than their raster brethren. I'm thinking of games like Asteroids, Battlezone, and Lunar Lander.
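For a sense of how simple those approximations can be, here is a minimal sketch of a Lunar Lander style update, reduced to the vertical axis. It is not taken from any real game; the class name, the time step, and the starting values are all assumptions, and the integration scheme (semi-implicit Euler) is just one common choice.

```java
// Hypothetical sketch; names and numbers are illustrative, not from any real game.
final class LanderStep {

    // Advances one time step using semi-implicit Euler integration:
    // update velocity from acceleration first, then position from the new velocity.
    // Returns {altitude, velocity}; "up" is positive.
    static double[] step(double altitude, double velocity,
                         double thrustAccel, double gravity, double dt) {
        double newVelocity = velocity + (thrustAccel - gravity) * dt;
        double newAltitude = altitude + newVelocity * dt;
        return new double[] { newAltitude, newVelocity };
    }

    public static void main(String[] args) {
        double altitude = 100.0;  // metres above the surface
        double velocity = 0.0;    // metres per second
        double gravity = 1.62;    // approximate lunar gravity, m/s^2
        double dt = 0.1;          // seconds per simulated frame

        // Free fall with the engine off until the lander reaches the surface.
        int frames = 0;
        while (altitude > 0.0) {
            double[] next = step(altitude, velocity, 0.0, gravity, dt);
            altitude = next[0];
            velocity = next[1];
            frames++;
        }
        System.out.println("Touched down after " + frames
                + " frames at " + velocity + " m/s");
    }
}
```

Two lines of arithmetic per frame are enough to produce believable falling and braking. Everything a full physics engine adds (collisions, constraints, rotation) builds on the same step-by-step idea.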

Accurately simulating the physics of the real world has been the domain of supercomputers for decades. The simulation of even "simple" physical phenomena like fire, smoke, and water requires a staggering amount of math. Now that we almost have multicore supercomputers on every desktop, it's only natural that this aspect of computing would trickle down to us.

This topic is particularly relevant in light of today's introduction of NVIDIA's newest video card, the GTX 280, which contains a whopping 1.4 billion transistors. That's a lot. For context and scale, here's a shot of the 280 GPU next to a modern Intel dual-core CPU.

(Image: the GTX 280 GPU next to an Intel Penryn dual-core CPU.)

I've talked about this before in CPU vs. GPU, but it bears repeating: some of the highest performing hardware in your PC lies on your video card. At least for a certain highly parallelizable set of tasks.

We were able to compress our test video (400 MB) in iPhone format (640*365) at maximum quality in 56.5 seconds on the 260 GTX and 49 seconds on the 280 GTX (15% faster). For comparison purposes, the iTunes H.264 encoder took eight minutes using the CPU (consuming more power overall but significantly less on peaks).

While one of the primary benefits of manycore CPUs is radically faster video encoding, let's put this in context -- compared to the newest, speediest quad core CPU, you can encode video ten times faster using a modern video card GPU. It's my hope that CUDA, Microsoft's Accelerator, and Apple's Grand Central/OpenCL will make this more accessible to a wide range of software developers.
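The "ten times faster" figure can be checked against the timings quoted above. The sketch below only does that arithmetic; the class and method names are hypothetical, and the inputs are the quoted numbers (eight minutes on the CPU, 56.5 seconds on the 260 GTX, 49 seconds on the 280 GTX).

```java
// Hypothetical helper; the timings are the ones quoted above.
final class EncodeSpeedup {

    // Ratio of a baseline time to an improved time, both in seconds.
    static double speedup(double baselineSeconds, double improvedSeconds) {
        return baselineSeconds / improvedSeconds;
    }

    public static void main(String[] args) {
        double cpuSeconds = 8 * 60;    // iTunes H.264 encoder on the CPU: eight minutes
        double gtx260Seconds = 56.5;   // 260 GTX
        double gtx280Seconds = 49.0;   // 280 GTX

        System.out.printf("GTX 280 vs CPU: %.1fx%n", speedup(cpuSeconds, gtx280Seconds));
        System.out.printf("GTX 260 vs CPU: %.1fx%n", speedup(cpuSeconds, gtx260Seconds));
        System.out.printf("GTX 280 vs GTX 260: %.2fx%n", speedup(gtx260Seconds, gtx280Seconds));
    }
}
```

That works out to roughly 9.8 times faster on the 280 GTX and 8.5 times on the 260 GTX, with the 280 about 1.15 times as fast as the 260, which matches the quoted 15%.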

All this physics horsepower, whether it's coming from yet another manycore x86 CPU, or a massively parallel GPU, is there for the taking. There are quite a few physics engines available to programmers.

There is no shortage of physics games and sandboxes to play with this stuff, too. Here are a few of my favorites.

Perhaps the most archetypal physics based game is Chronic Logic's Bridge Construction Set, the original version of which dates way back to 1999. I'm showing a picture of their fancy NVIDIA branded version below, but it's hardly about the graphics. This is pure physics simulation at its most entertaining. Who knew civil engineering could be so much fun? Highly recommended.

Bridge It! screenshot

Oh, and small hint: after playing this game, you will learn to love the power and beauty of the simple triangle. You'll also marvel at the longer bridges you manage to drive across without plunging into the watery abyss underneath.

I've professed my love for The Incredible Machine and other Rube Goldberg devices before. The physics based game Armadillo Run is a modern iteration of the same. Get the armadillo from point A to point B using whatever gizmos and gadgets you find in your sandbox -- rendered in glorious 3D with a full-blown 2D physics engine in the background.

armadillo run screenshot

The latest physics based game to generate a lot of buzz is Trials 2: Second Edition. I haven't had a chance to try it yet, but the gameplay movie is extremely impressive. Like Armadillo Run, the action is all on a 2D plane, but the physics are impeccable.

Trials 2: Second Edition screenshot

I'm sure I've forgotten a few physics based games here; peruse this giant list of physics games to see if your favorite is already included.

See, physics can be fun -- and increasingly complex physics engines are an outstanding way to harness the massive computational horsepower that lies dormant in most modern PCs.
