Coding Horror

programming and human factors

The Hot/Crazy Solid State Drive Scale

As an early advocate of solid state hard drives …

… I feel ethically and morally obligated to let you in on a dirty little secret I've discovered in the last two years of full time SSD ownership. Solid state hard drives fail. A lot. And not just any fail. I'm talking about catastrophic, oh-my-God-what-just-happened-to-all-my-data instant gigafail. It's not pretty.

I bought a set of three Crucial 128 GB SSDs in October 2009 for the original two members of the Stack Overflow team plus myself. As of last month, two out of three of those had failed. And just the other day I was chatting with Joel on the podcast (yep, it's back), and he casually mentioned to me that the Intel SSD in his Thinkpad, which was purchased roughly around the same time as ours, had also failed.

Portman Wills, friend of the company and generally awesome guy, has a far scarier tale to tell. He got infected with the SSD religion based on my original 2009 blog post, and he went all in. He purchased eight SSDs over the last two years … and all of them failed. The tale of the tape is frankly a little terrifying:

  • Super Talent 32 GB SSD, failed after 137 days
  • OCZ Vertex 1 250 GB SSD, failed after 512 days
  • G.Skill 64 GB SSD, failed after 251 days
  • G.Skill 64 GB SSD, failed after 276 days
  • Crucial 64 GB SSD, failed after 350 days
  • OCZ Agility 60 GB SSD, failed after 72 days
  • Intel X25-M 80 GB SSD, failed after 15 days
  • Intel X25-M 80 GB SSD, failed after 206 days
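For the numerically inclined, the list above can be summarized in a few lines of Python. Treat this as a back-of-the-envelope sketch only: eight drives from one buyer is an anecdote, not a reliability study, and the variable names are mine.

```python
from statistics import mean, median

# Days until failure, copied from the list above.
days_to_failure = [137, 512, 251, 276, 350, 72, 15, 206]

mean_days = mean(days_to_failure)
median_days = median(days_to_failure)

# How many of the drives died within their first year?
within_year = sum(1 for d in days_to_failure if d <= 365)

print(f"mean lifetime:   {mean_days:.0f} days")
print(f"median lifetime: {median_days:.1f} days")
print(f"failed within a year: {within_year} of {len(days_to_failure)}")
```

In other words, the typical drive on this list lasted well under a year.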

You might think after this I'd be swearing off SSDs as unstable, unreliable technology. Particularly since I am the world's foremost expert on backups.

Well, you'd be wrong. I just went out and bought myself a hot new OCZ Vertex 3 SSD, the clear winner of the latest generation of SSDs to arrive this year. Storage Review calls it "the fastest SATA SSD we've seen."

Beta firmware or not though, the Vertex 3 is a scorcher. We'll get into the details later in the review, but our numbers show it as clearly the fastest SATA SSD to hit our bench.

While that shouldn't be entirely surprising, it's not just faster like, "Woo, it edged out the prior generation SF-1200 SSDs, yeah!" It's faster like, "Holy @&#% that's fast," boasting 69% faster results in some of our real-world tests.

Solid state hard drives are so freaking amazing performance wise, and the experience you will have with them is so transformative, that I don't even care if they fail every 12 months on average! I can't imagine using a computer without an SSD any more; it'd be like going back to dial-up internet or 13" CRTs or single button mice. Over my dead body, man!

It may seem irrational, but … well, I believe the phenomenon was explained best on the television show How I Met Your Mother by Barney Stinson, a character played brilliantly by geek favorite Neil Patrick Harris:

Barney: There's no way she's above the line on the 'hot/crazy' scale.

Ted: She's not even on the 'hot/crazy' scale; she's just hot.

Robin: Wait, 'hot/crazy' scale?

Barney: Let me illustrate!

[Image: the hot/crazy scale chart]

Barney: A girl is allowed to be crazy as long as she is equally hot. Thus, if she's this crazy, she has to be this hot. You want the girl to be above this line. Also known as the 'Vickie Mendoza Diagonal'. This girl I dated. She played jump rope with that line. She'd shave her head, then lose 10 pounds. She'd stab me with a fork, then get a boob job. [pause] I should give her a call.

Thing is, SSDs are so scorching hot that I'm willing to put up with their craziness. Consider that just in the last two years, their performance has doubled. Doubled! And the latest, fastest SSDs can even saturate existing SATA interfaces; they need brand new 6 Gbps interfaces to fully strut their stuff. No CPU or memory upgrade can come close to touching that kind of real world performance increase.
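To put the interface ceiling in numbers, here is a rough sketch. It relies on one fact about SATA, that it uses 8b/10b encoding on the wire, and ignores protocol overhead, so real drives land somewhat below these figures. The function name is my own invention.

```python
def sata_payload_mb_per_s(line_rate_gbps):
    """Theoretical payload ceiling in MB/s (decimal) for a SATA line rate.

    SATA uses 8b/10b encoding, so each payload byte costs 10 bits on the wire.
    """
    bits_per_second = line_rate_gbps * 1e9
    return bits_per_second / 10 / 1e6


for gbps in (1.5, 3.0, 6.0):
    ceiling = sata_payload_mb_per_s(gbps)
    print(f"SATA {gbps} Gbps -> about {ceiling:.0f} MB/s ceiling")
```

A drive that can read faster than the 3 Gbps ceiling simply has nowhere to put the extra speed until it gets a 6 Gbps port.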

Just make sure you have a good backup plan if you're running on an SSD. I do hope they iron out the reliability kinks in the next 2 generations … but I've spent the last two months checking out the hot/crazy solid state drive scale in excruciating detail, and trust me, you want one of these new Vertex 3 SSDs right now.

Working with the Chaos Monkey

Late last year, the Netflix Tech Blog wrote about five lessons they learned moving to Amazon Web Services. AWS is, of course, the preeminent provider of so-called "cloud computing", so this can essentially be read as key advice for any website considering a move to the cloud. And it's great advice, too. Here's the one bit that struck me as most essential:

We’ve sometimes referred to the Netflix software architecture in AWS as our Rambo Architecture. Each system has to be able to succeed, no matter what, even all on its own. We’re designing each distributed system to expect and tolerate failure from other systems on which it depends.

If our recommendations system is down, we degrade the quality of our responses to our customers, but we still respond. We’ll show popular titles instead of personalized picks. If our search system is intolerably slow, streaming should still work perfectly fine.

One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
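Here is a toy sketch of both ideas in Python: a page that degrades to popular titles when recommendations are down, and a monkey that kills a service at random. None of this is Netflix's actual code; every class, function, and title below is hypothetical and exists only to illustrate the pattern.

```python
import random

# Hypothetical fallback data; a real site would pull this from a cache.
POPULAR_TITLES = ["Popular Title 1", "Popular Title 2", "Popular Title 3"]


class ServiceDown(Exception):
    """Raised when a dependency cannot answer."""


class RecommendationService:
    """Toy stand-in for a personalized recommendations backend."""

    def __init__(self):
        self.alive = True

    def personalized(self, user):
        if not self.alive:
            raise ServiceDown("recommendations unavailable")
        return [f"Pick {i} for {user}" for i in range(1, 4)]


def homepage_titles(user, recs):
    """Degrade to popular titles instead of failing the whole page."""
    try:
        return recs.personalized(user)
    except ServiceDown:
        return POPULAR_TITLES


def chaos_monkey(services, rng=random):
    """Kill one randomly chosen service and report which one died."""
    victim = rng.choice(services)
    victim.alive = False
    return victim


recs = RecommendationService()
print(homepage_titles("alice", recs))  # personalized picks while healthy
chaos_monkey([recs])                   # only one candidate, so it dies
print(homepage_titles("alice", recs))  # falls back to POPULAR_TITLES
```

The customer still gets a page either way, and killing things on purpose is the entire test plan.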

Which, let's face it, seems like insane advice at first glance. I'm not sure many companies even understand why this would be a good idea, much less have the guts to attempt it. Raise your hand if where you work, someone deployed a daemon or service that randomly kills servers and processes in your server farm.

Now raise your other hand if that person is still employed by your company.

Who in their right mind would willingly choose to work with a Chaos Monkey?

Sometimes you don't get a choice; the Chaos Monkey chooses you. At Stack Exchange, we struggled for months with a bizarre problem. Every few days, one of the servers in the Oregon web farm would simply stop responding to all external network requests. No reason, no rationale, and no recovery except for a slow, excruciating shutdown sequence requiring the server to bluescreen before it would reboot.

We spent months -- literally months -- chasing this problem down. We walked the list of everything we could think of to solve it, and then some:

  • swapping network ports
  • replacing network cables
  • a different switch
  • multiple versions of the network driver
  • tweaking OS and driver level network settings
  • simplifying our network configuration and removing TProxy for more traditional X-FORWARDED-FOR
  • switching virtualization providers
  • changing our TCP/IP host model
  • getting Kernel hotfixes and applying them
  • involving high-level vendor support teams
  • some other stuff that I've now forgotten because I blacked out from the pain

At one point in this saga our team almost came to blows because we were so frustrated. (Well, as close to "blows" as a remote team can get over Skype, but you know what I mean.) Can you blame us? Every few days, one of our servers -- no telling which one -- would randomly wink off the network. The Chaos Monkey strikes again!

Even in our time of greatest frustration, I realized that there was a positive side to all this:

  • Where we had one server performing an essential function, we switched to two.
  • If we didn't have a sensible fallback for something, we created one.
  • We removed dependencies all over the place, paring down to the absolute minimum we required to run.
  • We implemented workarounds to stay running at all times, even when services we previously considered essential were suddenly no longer available.

Every week that went by, we made our system a tiny bit more redundant, because we had to. Despite the ongoing pain, it became clear that Chaos Monkey was actually doing us a big favor by forcing us to become extremely resilient. Not tomorrow, not someday, not at some indeterminate "we'll get to it eventually" point in the future, but right now where it hurts.
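The "one server becomes two" idea boils down to something like the sketch below. This is not our actual code; the two server functions are hypothetical stand-ins, and a real setup would do this at the load balancer rather than in application code.

```python
def call_with_failover(replicas, request):
    """Try each replica in order; return the first successful response."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next one
    raise RuntimeError("all replicas failed") from last_error


def dead_server(request):
    """Stand-in for a server that has winked off the network."""
    raise ConnectionError("no response")


def healthy_server(request):
    """Stand-in for a server that is still answering."""
    return f"200 OK {request}"


response = call_with_failover([dead_server, healthy_server], "/questions")
print(response)
```

With only one replica in the list, a single dead server takes the request down with it, which is exactly the situation the Chaos Monkey kept pointing out.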

Now, none of this is new news; our problem is long since solved, and the Netflix Tech Blog article I'm referring to was posted last year. I've been meaning to write about it, but I've been a little busy. Maybe the timing is prophetic; AWS had a huge multi-day outage last week, which took several major websites down, along with a constellation of smaller sites.

Notably absent from that list of affected AWS sites? Netflix.

When you work with the Chaos Monkey, you quickly learn that everything happens for a reason. Except for those things which happen completely randomly. And that's why, even though it sounds crazy, the best way to avoid failure is to fail constantly.

(update: Netflix released their version of Chaos Monkey on GitHub. Try it out!)

Revisiting the Home Theater PC

It's been almost three years since I built my home theater PC. I adore that little machine; it drives all of our family entertainment and serves as a general purpose home media server and streaming box. As I get older, I find that I'm no longer interested in having a home full of PCs whirring away. I only want one PC in my house on all the time, and I want it to be as efficient and versatile as possible.

My old low-power Athlon X2 based HTPC generally worked great, but still struggled with some occasional 1080p content. And when you have a toddler in the house, believe me, you need reliable 1080p playback. Only the finest in children's entertainment for my spawned process, I say!

When I recently had to transcode Megamind down to 720p to get it to play back without stuttering or pausing at times… I knew my current HTPC's days were numbered.

(Megamind is hilarious and highly recommended, by the way; it's far better than its Metacritic and Rotten Tomatoes percentages would seem to indicate.)

Now that Intel has finally released their Sandy Bridge CPUs -- the first with integrated GPUs -- I was eager to revisit and rebuild. The low power Core i3-2100T is the one I had my eye on, with a miserly TDP of 35 watts. Combine that with a decent Mini-ITX motherboard and a few other essential parts, and you're good to go:

Component     Part                   Price
CPU           Intel Core i3-2100T    $135
Motherboard   ASRock H67M ITX        $100
RAM           Corsair 4GB DDR3       $45
Case + PSU    Antec ISK 300-65       $70
HDD           750GB 2.5"             $70

Now, I am fudging a bit here. This is just the basic level of hardware to get a functional home theater PC. I didn't actually buy a case, PSU, or even hard drive for that matter; I recycled many of my old existing parts, so my personal outlay was all of 300 bucks. I'm including the fuller part list as courtesy recommendations in case you're starting from scratch. You also might want to add a Blu-Ray drive, and perhaps a Windows 7 Home Premium license ($99) for its excellent 10-foot Windows Media Center interface.

The magical part here is the extreme level of hardware integration: the CPU has a GPU and memory controller on die, and the motherboard has optical digital out and HDMI out built in. It's delightfully simple to build and downright cheap. Just assemble it, install your OS of choice (sorry, Apple fans), then plug it into your receiver and television and boot it up.

My results? I'll just get right to the good part, but please bear in mind each step is about twice as powerful as the one before:

Year   Cost     Hardware                         Idle power
2005   ~$1000   512 MB RAM, single core CPU      80 watts
2008   ~$520    2 GB RAM, dual core CPU          45 watts
2011   ~$420    4 GB RAM, dual core CPU + GPU    22 watts

I know I get way too excited about this stuff, but … holy crap, 22 tesla-lovin' watts at idle!

[Image: Kill A Watt power meter]

The kill-a-watt never lies. To be fair, it's more like 25 watts idle with torrents in the background. This little box is remarkably efficient; even when playing back a 1080p video it's not unusual to see CPU usage well under 50%, which equates to around 30-35 watts in practice. Under full, artificial multithreaded Prime95 load, it tops out at an absolute peak of 55 watts.

(Update: I ended up replacing my old Seasonic ECO 300 SFX power supply with a Pico PSU-90 plus 60 watt adapter kit. That got the idle power down from 22 watts to 17 watts, a solid savings of 22%. Recommended!)
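For a box that is on all the time, idle watts are what matter, so here is a quick sketch that turns the idle figures above into energy per year. It assumes the machine sits at idle around the clock, which overstates things a little in the machine's favor; the names are mine.

```python
HOURS_PER_YEAR = 24 * 365  # always-on box, non-leap year

# Idle draw in watts, taken from the figures above.
idle_watts = {
    "2005 build": 80,
    "2008 build": 45,
    "2011 build": 22,
    "2011 build + Pico PSU": 17,
}


def annual_kwh(watts):
    """Energy used in a year at a constant draw, in kilowatt-hours."""
    return watts * HOURS_PER_YEAR / 1000


for name, watts in idle_watts.items():
    print(f"{name}: {annual_kwh(watts):.1f} kWh/year")

# Fractional idle savings from swapping the power supply.
pico_savings = (22 - 17) / 22
print(f"Pico PSU idle savings: {pico_savings:.1%}")
```

That works out to roughly 700 kWh a year for the 2005 machine versus under 200 kWh for the 2011 one. Multiply by whatever your utility charges per kWh to get dollars.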

This is a killer setup, but don't take my word for it. There is an excruciatingly in-depth review of essentially the same system at Missing Remote, with a particular eye toward home theater duties. Spoiler: they loved the hell out of it too. And it compromises almost nothing in performance, with a Windows Experience score of 5.1 -- that would be a solid 5.8 if you factored out desktop Aero performance.

[Image: Windows Experience Index score]

(Also, in case you're wondering, I intentionally dropped the analog cable tuner. All modern cable is now digital, which means awkward DRM-ed up the wazoo CableCard systems. I've cancelled cable altogether; I'd rather take that $60+ per month and use it to support innovative companies who will deliver media through the internet, like Netflix, Hulu, etcetera. Or as I like to call it: the future, unless the media conglomerates with vaults full of cash manage to subvert net neutrality.)

When all is said and done, I have a new always-on, does-anything home theater box that is twice as fast as the one I built in 2008, while consuming less than half the power.

I've been a computer nerd since age 8, and I just turned 40. I should be jaded by computer hardware pornography by now, but I still find this progress amazing. At this rate, I can't wait to find out what my 2014 home theater PC will look like.

The Importance of Net Neutrality

Although I remain a huge admirer of Lawrence Lessig, I am ashamed to admit that I never fully understood the importance of net neutrality until last week. Mr. Lessig described network neutrality in these urgent terms in 2006:

At the center of the debate is the most important public policy you've probably never heard of: "network neutrality." Net neutrality means simply that all like Internet content must be treated alike and move at the same speed over the network. The owners of the Internet's wires cannot discriminate. This is the simple but brilliant "end-to-end" design of the Internet that has made it such a powerful force for economic and social good: All of the intelligence and control is held by producers and users, not the networks that connect them.

Fortunately, the good guys are winning. Recent legal challenges to network neutrality have been defeated, at least under US law. I remember hearing about these legal decisions at the time, but I glossed over them because I thought they were fundamentally about file sharing and BitTorrent. Not to sound dismissive, but someone's legal right to download a complete video archive of Firefly wasn't exactly keeping me up at night.

But network neutrality is about far more than file sharing bandwidth. To understand what's at stake, study the sordid history of the world's communication networks – starting with the telegraph, radio, telephone, television, and onward. Without historical context, it's impossible to appreciate how scarily easy it is for common carriage to get subverted and undermined by corporations and government in subtle (and sometimes not so subtle) ways, with terrible long-term consequences for society.

That's the genius of Tim Wu's book The Master Switch: The Rise and Fall of Information Empires.

One of the most fascinating stories in the book is that of Harry Tuttle and AT&T.

Harry Tuttle was, for most of his life, president of the Hush-a-Phone Corporation, manufacturer of the telephone silencer. Apart from Tuttle, Hush-a-Phone employed his secretary. The two of them worked alone out of a small office near Union Square in New York City. Hush-a-Phone's signature product was shaped like a scoop, and it fit around the speaking end of a receiver, so that no one could hear what the user was saying on the telephone. The company motto emblazoned on its letterhead stated the promise succinctly: "Makes your phone private as a booth."

If the Hush-a-Phone never became a household necessity, Tuttle did a decent business, and by 1950 he would claim to have sold 125,000 units. But one day late in the 1940s, Harry Tuttle received alarming news. AT&T had launched a crackdown on the Hush-a-Phone and similar products, like the Jordaphone, a creaky precursor of the modern speakerphone, whose manufacturer had likewise been put on notice. Bell repairmen began warning customers that Hush-a-Phone use was a violation of a federal tariff and that, failing to cease and desist, they risked termination of their telephone service.

Was AT&T merely blowing smoke? Not at all: the company was referring to a special rule that was part of their covenant with the federal government. It stated: No equipment, apparatus, circuit or device not furnished by the telephone company shall be attached to or connected with the facilities furnished by the telephone company, whether physically, by induction, or otherwise.

Tuttle hired an attorney, who petitioned the FCC for a modification of the rule and an injunction against AT&T's threats. In 1950 the FCC decided to hold a trial (officially a "public hearing") in Washington, D.C., to consider whether AT&T, the nation's regulated monopolist, could punish its customers for placing a plastic cup over their telephone mouthpiece.

The story of the Hush-a-Phone and its struggle with AT&T, for all its absurdist undertones, offers a window on the mindset of the monopoly at its height, as well as a picture of the challenges facing even the least innovative innovator at that moment.

Absurdist, indeed – Harry Tuttle is also not-so-coincidentally the name of a character in the movie Brazil, one who attempts to work as a renegade, outside oppressive centralized government systems. Often at great peril to his own life and, well, that of anyone who happens to be nearby, too.

But the story of Harry Tuttle isn't just a cautionary tale about the dangers of large communication monopolies. Guess who was on Harry Tuttle's side in his sadly doomed legal effort against the enormously powerful Bell monopoly? No less than an acoustics professor by the name of Leo Beranek, and an expert witness by the name of J.C.R. Licklider.

If you don't recognize those names, you should. J.C.R. Licklider went on to propose and design ARPANET, and Leo Beranek became one of the B's in Bolt, Beranek and Newman, who helped build ARPANET. In other words, these gentlemen went on from battling the Bell monopoly in court in the 1950s to designing a system in 1968 that would ultimately defeat it: the internet.

The internet is radically unlike all the telecommunications networks that have preceded it. It's the first national and global communication network designed from the outset to resist mechanisms for centralized control and monopoly. But resistance is not necessarily enough; The Master Switch makes a compelling case that, historically speaking, all communication networks start out open and then rapidly swing closed as they are increasingly commercialized.

Just as our addiction to the benefits of the internal combustion engine led us to such demand for fossil fuels as we could no longer support, so, too, has our dependence on our mobile smart phones, touchpads, laptops, and other devices delivered us to a moment when our demand for bandwidth – the new black gold – is insatiable. Let us, then, not fail to protect ourselves from the will of those who might seek domination of those resources we cannot do without. If we do not take this moment to secure our sovereignty over the choices that our information age has allowed us to enjoy, we cannot reasonably blame its loss on those who are free to enrich themselves by taking it from us in a manner history has foretold.

It's up to us to be vigilant in protecting the concepts of common carriage and network neutrality on the internet. Even devices that you may love, like an iPad, Kindle, or Xbox, can easily be turned against you – if you let them.

How to Write Without Writing

I have a confession to make: in a way, I founded Stack Overflow to trick my fellow programmers.

Before you trot out the pitchforks and torches, let me explain.

Over the last 6 years, I've come to believe deeply in the idea that becoming a great programmer has very little to do with programming. Yes, it takes a modicum of technical skill and dogged persistence, absolutely. But even more than that, it takes serious communication skills:

The difference between a tolerable programmer and a great programmer is not how many programming languages they know, and it's not whether they prefer Python or Java. It's whether they can communicate their ideas. By persuading other people, they get leverage. By writing clear comments and technical specs, they let other programmers understand their code, which means other programmers can use and work with their code instead of rewriting it. Absent this, their code is worthless.

That is of course a quote from my co-founder Joel Spolsky, and it's one of my favorites.

In defense of my fellow programmers, communication with other human beings is not exactly what we signed up for. We didn't launch our careers in software development because we loved chatting with folks. Communication is just plain hard, particularly written communication. How exactly do you get better at something you self-selected out of? Blogging is one way:

People spend their entire lives learning how to write effectively. It isn't something you can fake. It isn't something you can buy. You have to work at it.

That's exactly why people who are afraid they can't write should be blogging.

It's exercise. No matter how out of shape you are, if you exercise a few times a week, you're bound to get fitter. Write a small blog entry a few times every week and you're bound to become a better writer. If you're not writing because you're intimidated by writing, well, you're likely to stay that way forever.

Even with the best of intentions, telling someone "you should blog!" never works. I know this from painful first hand experience. Blogging isn't for everyone. Even a small blog entry can seem like an insurmountable, impenetrable, arbitrary chunk of writing to the average programmer. How do I get my fellow programmers to blog without blogging, to write without writing?

By cheating like hell, that's how.

Consider this letter I received:

I'm not sure if you have thought about this side effect or not, but Stack Overflow has taught me more about writing effectively than any class I've taken, book I've read, or any other experience I have had before.

I can think of no other medium where I can test my writing chops (by writing an answer), get immediate feedback on its quality (particularly when writing quality trumps technical correctness, such as subjective questions) and see other people's attempts as well and how they compare with mine. Votes don't lie and it gives me a good indicator of how well an email I might send out to future co-workers would be received or a business proposal I might write.

Over the course of the past 5 months all the answers I've been writing have been more and more refined in terms of the quality. If I don't end up as the top answer I look at the answer that did and study what they did differently and where I faltered. Was I too verbose or was I too terse? Was I missing the crux of the question or did I hit it dead on?

I know that you said that writing your Coding Horror blog helped you greatly in refining your writing over the years. Stack Overflow has been doing the same for me and I just wanted to thank you for the opportunity. I've decided to set up a coding blog in your footsteps and I just registered a domain today. Hopefully that will go as well as writing on SO has. There are no tougher critics than fellow programmers who scrutinize every detail, every technical remark and grammar structure looking for mistakes. If you can effectively write for and be accepted by a group of programmers you can write for anyone.

Joel and I have always positioned Stack Overflow, and all the other Stack Exchange Q&A sites, as lightweight, focused, "fun size" units of writing.

Yes, by God, we will trick you into becoming a better writer if that's what it takes – and it always does. Stack Overflow has many overtly gamelike elements, but it is a game in service of the greater good – to make the internet better, and more importantly, to make you better. Seeing my fellow programmers naturally improve their written communication skills while participating in a focused, expert Q&A community with their peers? Nothing makes me prouder.

Beyond programming, there's a whole other community of peers out there who grok how important writing is, and will support you in sharpening your saw, er, pen. We have our own, too.

Writers Stack Exchange

If you're an author, editor, reviewer, blogger, copywriter, or aspiring writer of any kind, professional or otherwise – check out writers.stackexchange.com. Becoming a more effective writer is the one bedrock skill that will further your professional career, no matter what you choose to do.

But mostly, you should write. I thought Jon Skeet summed it up particularly well here:

Everyone should write a lot – whether it's a blog, a book, Stack Overflow answers, emails or whatever. Write, and take some care over it. Clarifying your communication helps you to clarify your own internal thought processes, in my experience. It's amazing how much you find you don't know when you try to explain something in detail to someone else. It can start a whole new process of discovery.

The process of writing is indeed a journey of discovery, one that will last the rest of your life. It doesn't ultimately matter whether you're writing a novel, a printer review, a Stack Overflow answer, fan fiction, a blog entry, a comment, a technical whitepaper, some emo LiveJournal entry, or even meta-talk about writing itself. Just get out there and write!
