Coding Horror

programming and human factors

Everything I Needed to Know About Programming I Learned from BASIC

Edsger Dijkstra had this to say about Beginner's All Purpose Symbolic Instruction Code:

It is practically impossible to teach good programming style to students that have had prior exposure to BASIC; as potential programmers they are mentally mutilated beyond hope of regeneration.

I'm sure he was exaggerating here for effect; as much as I admire his 1972 "The Humble Programmer" paper, it's hard to square that humility with the idea that choosing the wrong programming language will damage the programmer's mind. Although computer languages continue to evolve, the largest hurdle I see isn't any particular choice of language, but the fact that programmers can write FORTRAN in any language. To quote Pogo, we have met the enemy, and he is us.

Dismissing BASIC does seem rather elitist. Like many programmers of a certain age, I grew up with BASIC.

I mentioned in an earlier post the curious collision of early console gaming and programming that was the Atari 2600 BASIC Programming cartridge. I had to see this for myself, so I bought a copy on eBay.

Atari 2600 basic programming cartridge

I also bought a set of the Atari 2600 keypad controllers. The overlays come with the cartridge, and the controllers mate together to make a primitive sort of keyboard. (Also, if you were wondering what kinds of things I do with my ad revenue, buying crap like this is a big part of it, sadly.)

Atari 2600 BASIC programming keypads

Surprisingly, the manual isn't available anywhere online, so I scanned it in myself. Take a look. It's hilarious. There is a transcribed HTML version of the manual, but it's much less fun to read without the pictures and diagrams.

I booted up a copy of the Basic Programming ROM in the Stella Atari 2600 emulator, then followed along with the manual and wrote a little BASIC program.

Atari 2600 BASIC Programming Screenshot

You'll notice that all the other screenshots of Atari 2600 Basic Programming on the web are essentially blank. That's probably because I'm the only person crazy enough to actually try programming in this thing. It may look painful, but you have no idea until you've tried to work with this funky "IDE". It's hilariously bad. I could barely stop laughing while punching away at my virtual keypads. But I have to confess, after writing my first "program", I got that same visceral little thrill of bending the machine to my will that I've always gotten.

The package I got from eBay included a few hand-written programming notes that I assume are from the 1980s.

Atari 2600 sample code

Isn't that what BASIC – even this horribly crippled, elephant man Atari 2600 version of BASIC – is all about? Discovering fundamental programming concepts?

Of course, if you were at all interested in computers, you wouldn't bother programming on a dinky Atari 2600. There were much better options for gaming and programming in the form of home computers. And for the longest time, every home computer you could buy had BASIC burned into the ROM. Whether it was the Apple //, Commodore 64, or the Atari 800, you'd boot up to be greeted by a BASIC prompt. It became the native language of the hobbyist programmer.

basic on the Apple // series

basic on the Atari 8-bit series

basic on the Commodore 64

Even the IBM PC had BASICA, GW-BASIC and finally QBasic, which was phased out with Windows 2000.

It's true that if you wanted to do anything remotely cutting-edge with those old 8-bit Apple, Commodore and Atari home computers, you had to pretty much learn assembly language. I don't recall any compiled languages on the scene until the IBM PC and DOS era, primarily Turbo Pascal. Compiled languages were esoteric and expensive until the great democratization of Turbo Pascal at its low, low price point of $49.95.*

Even if you lacked the programming skills to become the next David Crane or Will Wright, there were still a lot of interesting games and programs you could write in good old BASIC. Certainly more than enough to figure out if you enjoyed programming, and if you had any talent. The Creative Computing compilations were like programming bibles to us.

BASIC Computer Games / More BASIC Computer Games

For a long, long time, if you were interested in computers at all, you programmed in BASIC. It was as unavoidable and inevitable as the air you breathed. Every time you booted up, there was that command prompt blinking away at you. Why not type in some BASIC commands and see what happens? And then the sense of wonder, of possibility, of being able to unlock the infinitely malleable universe inside your computer. Thus the careers of millions of programmers were launched.

BASIC didn't mutilate the mind, as Dijkstra claimed. If anything, BASIC opened the minds of millions of young programmers. It was perhaps the earliest test to determine whether you were a programming sheep or a non-programming goat. Not all will be good, of course, but some inevitably will go on to be great.

Whether we're still programming in it or not, the spirit of BASIC lives on in all of us.

* as an aside, you may notice that Anders Hejlsberg was the primary author of Turbo Pascal and later Delphi; he's now a Technical Fellow at Microsoft and the chief designer of the C# language. That's a big reason why so many longtime geeks, such as myself, are so gung-ho about .NET.

Discussion

Should All Developers Have Manycore CPUs?

Dual-core CPUs are effectively standard today, and for good reason -- there are substantial, demonstrable performance improvements to be gained from having a second CPU on standby to fulfill requests that the first CPU is too busy to handle. If nothing else, dual-core CPUs protect you from badly written software; if a crashed program consumes all possible CPU time, all it can get is 50% of your CPU. There's still another CPU available to ensure that the operating system can let you kill CrashyApp 5.80 SP1 Enterprise Edition in a reasonable fashion. It's the buddy system in silicon form.

My previous post on upgrading the CPU in your PC was more controversial than I intended. Here's what I wrote:

In my opinion, quad-core CPUs are still a waste of electricity unless you're putting them in a server. Four cores on the desktop is great for bragging rights and mathematical superiority (yep, 4 > 2), but those four cores provide almost no benchmarkable improvement in the type of applications most people use. Including software development tools.

It's unfortunate, because this statement overshadowed the rest of the post. All I wanted to do there was encourage people to make an informed decision when selecting a CPU. Really, pick any CPU you want; the important part of that post is being unafraid to upgrade your PC. Insofar as the above paragraph distracted readers from that goal, I apologize.

However, I do have strong feelings on this topic. All too often I see users seduced by Intel's marketing department, blindly assuming that if two CPU cores are faster than one CPU core, then, well... four, eight, or sixteen must be insanely fast! And out comes their wallet. I fear that many users fall prey to marketing weasels and end up paying a premium for performance that, for them, will never materialize. It's like the bad old days of the Pentium 4 again, except instead of absurd megahertz clock speeds, substitute an absurd number of CPU cores.

I want people to understand that there are only a handful of applications that can truly benefit from more than 2 CPU cores, and they tend to cluster tightly around certain specialized areas. To me, it's all about the benchmark data, and the benchmarks just don't show any compelling reason to go quad-core unless you regularly do one of the following:

  • "rip" or encode video
  • render 3D scenes professionally
  • run scientific simulations

If you frequently do any of the above, there's no question that a quad-core (or octa-core) is the right choice. But this is merely my recommendation based on the benchmark data, not iron-clad fact. It's your money. Spend it how you like. All I'm proposing is that you spend it knowledgably.

Ah, but then there's the multitasking argument. I implored commenters who felt strongly about the benefits of quad-core to point me to multitasking benchmarks that showed a profound difference in performance between 2 and more-than-2 CPU cores. It's curious. The web is awash in zillions of hardware review websites, yet you can barely find any multitasking benchmarks on any of them. I think it's because the amount of multitasking required to seriously load more than two CPU cores borders on the absurd, as Anand points out:

When we were trying to think up new multitasking benchmarks to truly stress Kentsfield and Quad FX [quad-core] platforms we kept running into these interesting but fairly out-there scenarios that did a great job of stressing our test beds, but a terrible job of making a case for how you could use quad-core today.

What you will find, however, is this benchmarking refrain repeated again and again:

Like most of the desktop applications out there today, including its component apps, WorldBench doesn't gain much from more than two CPU cores.

That said, I think I made a mistake in my original statement. Software developers aren't typical users. Indeed, you can make a reasonable case that software developers are almost by definition edge conditions and thus they should seek out many-core CPUs, as Kevin said in the comments:

How would you suggest developers write applications (this is what we are, and what we do, right?) that can actually leverage 4, 8, etc... CPU cores if we are running solo or dual core systems? I put this right up there with having multiple monitors. Developers need them, and not just to improve productivity, but because they won't understand just how badly their application runs across multiple monitors unless they actually use it. The same is true with multi-core CPUs.

I have two answers to this. One of them you probably won't like.

Let's start with the first one. I absolutely agree that it is important for software developers to consider multi-core software development, and owning a multi-core machine on their desktop is a prerequisite. I originally wrote about this way, way back in 2004 in Threading, Concurrency, and the Most Powerful Psychokinetic Explosive in the Universe. In fact, two of the people I quoted in that old article -- true leaders in the field of concurrent programming -- both posted direct responses to my article yesterday, and they deserve a response.

Rick Brewster, of the seriously amazing Paint.NET project, had this to say in a comment:

Huh? Paint.NET, for one, shows large gains on quad-core versus dual-core systems. There's even a benchmark. I'd say that qualifies as "applications most people use."

He's absolutely right. A quad-core Q6700 @ 2.66 GHz trounces my dual-core E8500 @ 4.0 GHz on this benchmark, to the tune of 26 seconds vs. 31 seconds. But with all due respect to Rick -- and seriously, I absolutely adore Paint.NET and his multithreading code is incredible -- I feel this benchmark tests specialized (and highly parallelizable) filters more than core functionality. There's a long history of Photoshop benchmarking along the same lines; it's the 3D rendering case minus one dimension. If you spend a significant part of your day in Photoshop, you should absolutely pick the platform that runs it fastest.

But we're developers, not designers. We spend all our time talking to compilers and interpreters and editors of various sorts. Herb Sutter posted an entire blog entry clarifying that, indeed, software development tools do take advantage of quad-core CPUs:

You must not be using the right tools. :-) For example, here are three I'm familiar with:
  1. Visual C++ 2008's /MP flag tells the compiler to compile files in the same project in parallel.
  2. Since Visual Studio 2005 we've supported parallel project builds in Batch Build mode.
  3. Excel 2007 does parallel recalculation. Assuming the spreadsheet is large and doesn't just contain sequential dependencies between cells, it usually scales linearly up to at least 8 cores.

Herb is an industry expert on concurrent programming and a general C++ guru, and of course he's right on all three counts. I had completely forgotten about C++ compilation, or maybe it's more fair to say I blocked it out. What do you expect from a guy with a BASIC lineage? Compilation time is a huge productivity drain for C++ developers working on large projects, and compiling with gcc under time make -j<# of cores + 1> is the granddaddy of all multi-core programmer benchmarks. Here's a representative result for compiling the LAME 3.97 source:

CPU configuration                       Compile time
1x Xeon E5150 (2.66 GHz Dual-Core)      12.06 sec
1x Xeon E5320 (1.86 GHz Quad-Core)      11.08 sec
2x Xeon E5150 (2.66 GHz Dual-Core)       8.26 sec
2x Xeon E5320 (1.86 GHz Quad-Core)       8.45 sec

The absolute numbers seem kind of small, but the percentages are incredibly compelling, particularly as you add up the number of times you compile every day. If you're a C++ developer, you need a quad-core CPU yesterday. Demand it.

But what about us managed code developers, with our lack of pointers and explicit memory allocations? Herb mentioned the parallel project builds setting in Visual Studio 2008; it's under Tools, Options, Projects and Solutions, Build and Run.

Visual Studio 2008 parallel project build settings

As promised, it's defaulting to the number of cores I have in my PC -- two. I downloaded the very largest .NET project I could think of off the top of my head, SharpDevelop. The solution is satisfyingly huge; it contains 60 projects. I compiled it a few times in Visual Studio 2008, but task manager wasn't showing much use of even my measly little two cores:

Visual Studio 2008 Compilation, Task Manager CPU time

I did see a few peaks above 50%, but it's an awfully tepid result compared to the make -j4 one. I see nothing here that indicates any kind of possible managed code compilation time performance improvement from moving to more than 2 cores. I'm sort of curious if Java compilers (or other .NET-like language compilers) do a better job of this.

Getting back to Kevin's question: yes, if you are a software developer writing a desktop application that has something remotely parallelizable in it, you should have whatever number of CPU cores on the desktop you need to test and debug your code. I suggest starting with a goal of scaling well to two cores, as that appears to be the most challenging part of the journey. Beyond that, good luck and godspeed, because everything I've ever read on the topic of writing scalable, concurrent software goes out of its way to explain in excruciating detail how hellishly difficult this kind of code is to write.
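
To make "remotely parallelizable" concrete, here's a minimal sketch of splitting a CPU-bound loop across however many cores the machine has, written in TypeScript against Node's worker_threads module. This is my own illustration, not anything from the tools discussed above; it assumes Node 16+ with @types/node installed, and that the file is compiled to plain CommonJS JavaScript with tsc before running.

import { Worker, isMainThread, parentPort, workerData } from "node:worker_threads";
import { cpus } from "node:os";

// CPU-bound busywork: sum of squares over a half-open range.
// (Numeric precision isn't the point here; keeping the cores busy is.)
function sumSquares(start: number, end: number): number {
  let total = 0;
  for (let i = start; i < end; i++) total += i * i;
  return total;
}

if (isMainThread) {
  const N = 200_000_000;
  const cores = cpus().length; // 2 on a dual-core, 4 on a quad-core, and so on
  const chunk = Math.ceil(N / cores);

  // One worker per core, each handed a slice of the range.
  const jobs = Array.from({ length: cores }, (_, i) =>
    new Promise<number>((resolve, reject) => {
      const worker = new Worker(__filename, {
        workerData: { start: i * chunk, end: Math.min(N, (i + 1) * chunk) },
      });
      worker.on("message", resolve);
      worker.on("error", reject);
    })
  );

  Promise.all(jobs).then((parts) => {
    console.log("total:", parts.reduce((a, b) => a + b, 0));
  });
} else {
  // Worker side: compute the assigned slice and report back.
  const { start, end } = workerData as { start: number; end: number };
  parentPort?.postMessage(sumSquares(start, end));
}

On a dual-core box this roughly halves the wall-clock time of the single-threaded loop; whether a third and fourth core keep helping is exactly the kind of thing you can only find out by running it on real hardware, which is Kevin's point.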

Here's the second part of the answer I promised you earlier. The one you might not like. Most developers aren't writing desktop applications today. They're writing web applications. Many of them may be writing in scripting languages that aren't compiled, but interpreted, like Ruby or Python or PHP. Heck, they're probably not even threaded. And yet this code somehow achieves massive levels of concurrency, scales to huge workloads, and drives some of the largest websites on the internet. All that, without thinking one iota about concurrency, threading, or reentrancy. It's sort of magical, if you think about it.

So in the sense that mainstream developers are modelling server workloads on their desktops, I agree, they do probably need as many cores as they can get.

Discussion

Building a PC, Part V: Upgrading

Last summer I posted a four-part series on building your own PC.

 

My personal system is basically identical to that build, though it predates it by about six months. The only significant difference is the substitution of the Core 2 Duo E6600 CPU.

In my opinion, quad-core CPUs are still a waste of electricity unless you're putting them in a server. Four cores on the desktop is great for bragging rights and mathematical superiority (yep, 4 > 2), but those four cores provide almost no benchmarkable improvement in the type of applications most people use. Including software development tools. (Update: This paragraph was more controversial than intended. See Should All Developers Have Manycore CPUs? for a clarification.)

Core 2 Duo E6600 CPU

My original advice stands: for the vast majority of users, the fastest possible dual-core CPU remains the best choice. I overclocked my E6600 CPU from 2.4 GHz to 3.2 GHz, instantly increasing the value of the processor by about 800 bucks.

Beyond overclocking, the economy of building your own PC also lies in upgrading it in pieces and parts to keep it up to date. Once you've taught yourself to build a PC, swapping parts out is easy. That's an option you almost never have on laptops, and rarely on commercial desktops.

It's been almost a year and a half since I made any significant change to my PC build. That's an eternity in computer dog years. I was developing a serious itch to upgrade something -- anything -- on my PC. I did a bit of research, and I was surprised to find that the P965 chipset on my Asus P5B Deluxe motherboard supports the latest and greatest Intel CPUs. This is a pleasant surprise indeed; Intel and AMD change the pinouts and sockets of their CPUs quite regularly. A simple CPU upgrade, more often than not, forces a complete motherboard and memory upgrade. But not in this case!

So here's what I did:

  1. flash the BIOS* on my motherboard to the latest version, which supports the newest CPUs
  2. remove the old and busted CPU (Core 2 Duo E6600, 2.4 GHz, 4 MB L2)
  3. drop in the new hotness CPU (Core 2 Duo E8500, 3.16 GHz, 6 MB L2)
  4. manually adjust the FSB speed, memory voltage, and CPU voltage

This chip is an outstanding overclocker. It's almost a no-brainer. The tubes are full of documented cases of this chip reaching 4.5 GHz and sometimes higher. I was fairly content with my effortless 4 GHz overclock:

cpu-z E8500 @ 4 GHz

If you're wondering why CPU-Z says this is a 2520 MHz CPU instead of the 4000 MHz you'd expect, that's because the CPU is idle. All modern CPUs clock down at idle to reduce power draw. If you run something CPU intensive, you'll see the CPU speed dynamically change in CPU-Z, as illustrated by this animated GIF:

CPU-Z SpeedStep animation

This power savings is achieved by dropping the CPU multiplier from its default of 9.5 down to 6.0. If we do a little math, it's easy to infer the relationship between FSB (front side bus), CPU multiplier, and actual CPU speed:

 

FSB        Multiplier    CPU speed
315 MHz    6.0x          1890 MHz
333 MHz    9.5x          3163 MHz
420 MHz    6.0x          2520 MHz
420 MHz    9.5x          3990 MHz

 

Overclocking the CPU is simple if you can stumble your way through a few basic BIOS screens. The default voltage on this E8500 is 1.128 volts. By juicing the CPU voltage up to 1.36 volts, and setting the front side bus (FSB) to 420 MHz, we can hit the magical 4 GHz number. All we need to do is a little unit testing -- er, burn-in torture testing -- and we can confirm that it's stable.
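
If you want to check that multiplier math yourself, here's a throwaway TypeScript snippet with the data points from the table above (my own sanity check, obviously not part of any BIOS tooling):

// Core clock is just the front side bus speed times the CPU multiplier.
const coreClockMHz = (fsbMHz: number, multiplier: number): number => fsbMHz * multiplier;

console.log(coreClockMHz(333, 9.5)); // 3163.5 -- the E8500's stock ~3.16 GHz
console.log(coreClockMHz(420, 6.0)); // 2520   -- what CPU-Z reports at idle
console.log(coreClockMHz(420, 9.5)); // 3990   -- the overclocked ~4 GHz figure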

But you might wonder -- does this overclocking stuff really justify the hassle? Is going from 3.0 GHz to 4.0 GHz really worth it in terms of actual performance and not just bragging rights?

I'm glad you asked!

I clocked my E8500 to 3.0 GHz / 315 FSB and 4.0 GHz / 420 FSB and ran a few quick SunSpider JavaScript benchmarks. You may remember this great little benchmark from The Great Browser JavaScript Showdown. Here's what I found:

JavaScript Sunspider CPU Performance from 3 GHz to 4 GHz

And the overall benchmark result in table form:

 

Browser                       3 GHz        4 GHz        Improvement
Internet Explorer 7 SP1       15,824 ms    12,748 ms    19% faster
Firefox 3.0 Beta 5            3,018 ms     2,450 ms     19% faster

 

That's a consistent 19% performance improvement in an interpreted browser language for a 33% increase in raw CPU clock speed. Not too shabby. It's actually more than I expected. The real speed difference between an E6600 and E8500 would be (slightly) greater than the pure clock speed indicates, due to the architectural improvements and larger L2 cache in the E8500. There also might be other languages and apps that scale more linearly with that 33% CPU clock speed increase.

Compare the result of going from 3 GHz to 4 GHz with adding another two cores, which would produce exactly zero improvement in your JavaScript benchmarks. Most apps are barely multithreaded, much less capable of taking advantage of all four cores. Having four CPU cores won't help you much when they're all poking along at a leisurely 2 GHz.

So if you followed our original PC build plan, or if you're planning to build your own PC -- don't forget to factor upgrading into your system's lifespan! These builds are eminently upgradeable. Sometimes you'll get lucky and have knockout upgrade options like the E8500: a 4 GHz (almost) guaranteed drop-in CPU replacement for under 300 bucks.

* I am simplifying a little because I don't want to scare anyone. In the interests of full disclosure, here's the story. The ASUS Windows x64 BIOS flash program crashed while updating the motherboard BIOS. I can't quite describe the chill that went down my spine as I watched this happen. Any failure during a BIOS flash is irrevocable and permanent, the very definition of "bricking". To be fair, this is literally the first time I've ever bricked anything in at least 10 years of regular yearly BIOS flashing. I had to buy another motherboard and initiate an RMA on my original, newly BIOS-free motherboard. Let this be a lesson to you, kids: don't trust Windows software developers! Always update the BIOS from a boot CD or from within the BIOS itself using a USB key!

Discussion

Introducing Stackoverflow.com

A little over a month ago, I announced that I was quitting my job. But there was also something else I didn't fully announce.

But I refuse to become a full-time blogger. I think that's a cop-out. If I look at the people I respect most in the industry, the people I view as role models -- Paul Graham, Joel Spolsky, Steve Yegge, Eric Sink, Rich Skrenta, Marc Andreessen, Wil Shipley, Douglas Crockford, Scott Guthrie -- they all have one thing in common. They're not just excellent writers and communicators. They build stuff, too. The world has enough vapid commentary blogs. I want to build stuff -- and talk about it. I have a little micro-ISV startup opportunity I'll be working on, a web property I'm building out with one of the above people. I'm not ready to announce the details yet, but when I do, you'll read about it here.

The "building stuff", as you helped us determine, is stackoverflow.com. It's a small company Joel Spolsky and I are founding together.

If you've been reading my blog for a while, you might find this pairing strange. It's true that I've been critical of Joel in the past. And it is sort of funny that I own the number one image search result and a top 10 search result for Joel Spolsky. Good thing Joel has a sense of humor.

Occasionally I'll meet readers, or get emails from readers, who tell me that they enjoy my blog... and oh-by-the-way they strongly disagree with a few things I've said. Their phrasing clearly implies that they think there's something wrong with this. Well, there isn't. I'm here to tell you that occasional disagreement is healthy and normal. If you agree with everything I write here, why would you bother reading? At that point, we're the same person. I distrust people who agree with me all the time. I want someone to push back and encourage me to question my assumptions.

I admire what Joel has created. He was one of the earliest programming bloggers, and certainly one of the first I found that helped me realize the kind of positive influence writing could have on my fellow programmers. He is very much living the dream: he founded a company with the express intent of not cashing out with VC money, but creating a sustainable place where programmers can have fun while programming useful stuff. It's an honor to have the opportunity to work closely with Joel, and to combine the collective power of our two communities.

So what is stackoverflow?

From day one, my blog has been about putting helpful information out into the world. I never had any particular aspirations for this blog to become what it is today; I'm humbled and gratified by its amazing success. It has quite literally changed my life. Blogs are fantastic resources, but as much as I might encourage my fellow programmers to blog, not everyone has the time or inclination to start a blog. There's far too much great programming information trapped in forums, buried in online help, or hidden away in books that nobody buys any more. We'd like to unlock all that. Let's create something that makes it easy to participate, and put it online in a form that is trivially easy to find.

Are you familiar with the movie pitch formula?

Stackoverflow is sort of like the anti-experts-exchange (minus the nausea-inducing sleaze and quasi-legal search engine gaming) meets wikipedia meets programming reddit. It is by programmers, for programmers, with the ultimate intent of collectively increasing the sum total of good programming knowledge in the world. No matter what programming language you use, or what operating system you call home. Better programming is our goal.

Of course, there's more to it than that. Joel and I are recording our weekly calls and releasing them as podcasts. Listen to us describe our vision for stackoverflow in our own words -- just head over to stackoverflow.com to download the first 46-minute episode. We're even taking questions, if you submit them in the form of audio recordings.

Discussion

Your Session Has Timed Out

How many times have you returned to your web browser to be greeted by this unpleasant little notification:

Your session has timed out. Please sign in again.

If you're anything like me, the answer is lots. What's worse is that you're usually kicked out of whatever page context you were working in. You have to manually log in again, remember what you were doing, then navigate back to where you were and resume your work.

Most programmers look at this sort of browser session timeout as a necessary evil -- sometimes even as a security "feature". I know my bank website zealously logs me out of its web interface if I'm idle for more than five minutes. I'm not sure either one of these reasons is particularly justifiable.

As a programmer, I understand why session expiration occurs. The HTTP protocol that the web is built on is stateless. That means every individual request your browser sends to a web server is a newborn babe, cruelly born into a world that is utterly and completely oblivious to its existence. The way modern web applications get around this is by telling the browser to send a small, unique value back to the website with each request -- this is known as an HTTP cookie. It sounds a lot tastier than it looks:

Content-type: text/html
Cookie: SessionId=5451297120

While there are privacy concerns with cookies, it is a generally accepted practice today -- at least for the first-party cookie flavors. While it is possible to maintain state without cookies, it's painful and awkward.

Every web request to that server will include that cookie and its associated session id until the cookie expires, usually many months or even years hence. The browser definitely isn't the forgetful party here.

It's up to the server to correlate the unique session identifier sent by the browser with your individual identity, context, settings, and preferences. This is usually stored in a database of some kind, keyed by your session identifier. For performance reasons, some chunk of session information also ends up in the server's memory; there's no need to reach all the way out to the database the next twenty-six times you obsessively refresh your Facebook profile page.
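
In code, that lookup amounts to something like the following TypeScript sketch. It's my own illustration of the general pattern; sessionStore is a made-up stand-in for whatever database layer a real application would use.

interface SessionData {
  userId: string;
  expiresAt: number; // epoch milliseconds
  // ...plus settings, preferences, shopping cart, and so on
}

// Hypothetical persistent session store -- a stand-in, not a real library.
declare const sessionStore: {
  find(sessionId: string): Promise<SessionData | null>;
};

// In-memory cache so hot sessions skip the database round trip.
const sessionCache = new Map<string, SessionData>();

async function loadSession(sessionId: string): Promise<SessionData | null> {
  const cached = sessionCache.get(sessionId);
  if (cached && cached.expiresAt > Date.now()) return cached;

  const session = await sessionStore.find(sessionId);
  if (!session || session.expiresAt <= Date.now()) {
    return null; // expired or unknown: "Your session has timed out."
  }

  sessionCache.set(sessionId, session);
  return session;
}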

Still, that doesn't explain why the web server mysteriously forgets about us. If anything, the server has all the information it needs to remember you, even if you walked away from your computer for a week. So why does the server choose to arbitrarily forget about you in an hour?

  1. Performance. Consider a highly trafficked web site. If the website tried to keep sessions alive for an entire month, that could cause the session table to grow to millions of records. It's even worse if you think about it in terms of user information cached in memory; a measly few kilobytes of memory state per user doesn't sound like much, but multiplied by a few million, it absolutely is. If this data wasn't expired and dumped on some schedule, it would quickly blow up the web server.

  2. Security. The magic cookie that stores your session can potentially be stolen. If that cookie never expires, you have an infinitely long vulnerability window to session hijacking. This is serious stuff, and mitigation strategies are limited. The best option, short of encrypting the entire connection from end to end via HTTPS, is to keep a tight expiration window on the session cookie, and regenerate it frequently (a rough sketch of what that looks like follows this list).
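
Here's a rough server-side sketch of that mitigation in TypeScript on plain Node -- my own illustration with an arbitrary ten-minute window, not anyone's production code. A real application would also carry the old session's server-side state over to the regenerated id.

import { createServer } from "node:http";
import { randomBytes } from "node:crypto";

const SESSION_TTL_SECONDS = 10 * 60; // tight ten-minute expiration window

createServer((_req, res) => {
  // Issue (or re-issue) a short-lived, hardened session cookie on every response.
  const sessionId = randomBytes(16).toString("hex");
  res.setHeader(
    "Set-Cookie",
    `SessionId=${sessionId}; Max-Age=${SESSION_TTL_SECONDS}; Path=/; HttpOnly; Secure`
  );
  res.end("hello");
}).listen(8080);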

That's the why of browser session timeouts from the programmer's perspective. But that doesn't make it right. Far from it.

As a user, I can say pretty unequivocally that session expiration sucks. Is it really so unreasonable to start doing something in your web browser, walk away for an hour -- maybe even for a few hours -- then come back and expect things to just work?

As programmers, I think we can do better. It is possible. I am inundated with session timeout messages every day from a variety of sources, but I've never once seen a session expiration message from gmail, for example. Here's what I suggest:

  1. Create a background JavaScript process in the browser that sends regular heartbeats to the server. Each heartbeat regenerates the session cookie with a new timed expiration, say, every 5 or 10 minutes (see the sketch after this list).

  2. If you're worried about session hijacking -- and you really should be -- use an HTTPS-protected connection. This is an absolute no-brainer for financial institutions of any kind.
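
A minimal version of that heartbeat, as a browser-side TypeScript sketch -- my own illustration, where /session/keepalive is a hypothetical endpoint that re-issues the session cookie with a fresh expiration:

const HEARTBEAT_INTERVAL_MS = 5 * 60 * 1000; // every five minutes

async function sendHeartbeat(): Promise<void> {
  try {
    // credentials: "include" sends the session cookie along, so the server
    // can respond with a regenerated cookie and a new timed expiration.
    await fetch("/session/keepalive", { method: "POST", credentials: "include" });
  } catch {
    // A dropped heartbeat is harmless; the next one will try again.
  }
}

setInterval(sendHeartbeat, HEARTBEAT_INTERVAL_MS);

Presumably something morally equivalent to this is why gmail never greets you with that dreaded message.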

I wish more developers would test their web applications for session timeout issues. Despite all rumors to the contrary, your users will not be dedicating their entire lives to using your web application in a punctual and timely manner. They have phone calls to take, meetings to go to, other websites and applications to attend to.

Is it really fair to kick users all the way out of your web application, or worse, blindly reject data they've submitted -- just because they were impudent enough to wait a few hours since their last supplication to the web server gods? In most web apps, the penance is awfully severe for such a common sin.

Discussion