Coding Horror

programming and human factors

Strong Opinions, Weakly Held

I seldom pause to answer criticism of my blog. If I did, I'd have time for little else in the course of the day, and no time for constructive work. But occasionally I'll encounter a particularly well written critique that gives me pause, such as Alastair Rankine's Blogging Horror. Since I feel that Alastair wrote it out of genuine good will, and that his criticisms are sincerely set forth, I want to try to answer his statement in what I hope will be patient and reasonable terms.

However, Coding Horror has become so popular that Atwood has quit his day job and struck out on his own. To my mind, this raises the bar somewhat. Professional bloggers deserve more scrutiny than dabblers, just as in many other fields.

Not only has Atwood gone pro with his blog, but he has recently started a venture called stackoverflow to collate accepted wisdom from the software development community. It is early days, but from what I can gather there is still likely to be Atwood's editorial hand in the output, despite intentions of adopting community generated content.

In other words, Atwood seems to be setting himself up as an authority figure on software development and, well, I have some issues with this.

I'd like to first answer this with two slides from my January CUSEC presentation, presented here verbatim with no modifications.

What have I really done? Don't own a company. Didn't participate in an important startup. Didn't author a framework or standard. Haven't made a lot of money. Nothing. There is absolutely no reason any of you should listen to me. But somehow, I have 75,000 RSS subscribers and over 50,000 pageviews/day. It's a mystery to me, too.

Authority in our field is a strange thing. Perceived authority is stranger still.

I've always thought of myself as nothing more than a rank amateur seeking enlightenment. This blog is my attempt to invite others along for the journey. It has become a rather popular journey along the way, which has subtly altered the nature of the journey and the way I approach it, but the goal remains the same.

It troubles me greatly to hear that people see me as an expert or an authority, and not a fellow amateur:

When I got back to Boston I went to the library and discovered a book by Kimura on the subject, and much to my disappointment, all of our "discoveries" were covered in the first few pages. When I called back and told Richard what I had found, he was elated. "Hey, we got it right!" he said. "Not bad for amateurs."

In retrospect I realize that in almost everything that we worked on together, we were both amateurs. In digital physics, neural networks, even parallel computing, we never really knew what we were doing. But the things that we studied were so new that no one else knew exactly what they were doing either. It was amateurs who made the progress.

These people are industry giants, so any comparison between them and myself is accidental. It's the overall point they're making that I want to call your attention to: software is an incredibly young discipline. Everything in software is so new and so frequently being reinvented that almost nobody really knows what they are doing. It is amateurs who make all the progress.

When it comes to software development, if you profess expertise, if you pitch yourself as an authority, you're either lying to us, or lying to yourself. In our heart of hearts, we know: the real progress is made by the amateurs. They're so busy living software they don't usually have time to pontificate at length about the breadth of their legendary expertise. If I've learned anything in my career, it is that approaching software development as an expert, as someone who has already discovered everything there is to know about a given topic, is the one surest way to fail.

Experts are, if anything, more suspect than the amateurs, because they're less honest. Regardless, you absolutely should question everything I write here, in the same way you question everything you've ever read online -- or anywhere else for that matter. Your own research and data should trump any claims you read from anyone, no matter how much of an authority or expert you, I, Google, or the general community at large may believe them to be.

But if, as Alastair correctly points out, I now derive a significant part of my income from blogging, doesn't that make me a professional blogger by definition? I thought Dave Winer had a great explanation that I'll gladly co-opt:

Now if you ask me – there never was such a thing as a pro blogger. It's a contradiction in terms. It's like calling someone a professional amateur. It's like salty orange juice, a drink whose taste is derived from its acidity. Blogging is an amateur activity. It's users writing about what they do, not professionals writing about what users do.

What Dave's describing here is the difference between a journalist writing about programmers versus a programmer writing about programming. Blogging does not mean observing from the outside; it means participation. I like to think what I do at Coding Horror is a byproduct of shipping software, not some sort of bizarre sociological experiment I'm conducting. Although sometimes, I'll admit, it does feel that way. I am a generalist with a decidedly lowbrow coding background, so I can be a little scatterbrained. But directly or indirectly, everything I've ever written on this blog is a side-effect of my deep, lifelong love of my ongoing work as a programmer.

You could argue that I'm a better writer than programmer. Perhaps that's true. I'll be the first to tell you that I am not an exceptional programmer. A competent programmer, yes. Always. On a good day, perhaps even a decent programmer. But I don't kid myself, either. I'll never be one of the best. But what I lack in talent, I make up in intensity.

Which means, mathematically speaking, I must be pretty damn intense.

The bite-sized morsels posted to Coding Horror are all very well for bite-sized topics. But things can often go awry if the topic is too complex to be distilled down easily. Oversimplification often ensues, as in the following examples, all recent:

  • An attempted critique of XML ...
  • A similar "it's-too-hard" reaction seems to be at the heart of an article on humane markup languages ...
  • Admittedly Model-View-Controller is an increasingly vague concept these days, but I just couldn't buy Atwood's example of it ...
  • A comment that software forking is "the very embodiment of freedom zero" demonstrates that Atwood has no idea what freedom zero is ...

Common to all of these are a superficial understanding of the topic at hand. In short, Atwood just isn't credible.

Maybe a little too intense, sometimes. It's almost like I'm trying to overcompensate for something, but I can't imagine what that could be.

I'm Rex, founder of the Rex Kwon Do self-defense system! After one week with me in my dojo, you'll be prepared to defend yourself with the STRENGTH of a grizzly, the reflexes of a PUMA, and the wisdom of a MAN.

Like Rex of Rex Kwon Do, perhaps I'm relying a bit too heavily on the "Smackdown" learning model here in my dojo. I use it because I personally find it incredibly effective, for all the reasons that Kathy Sierra outlines.

But I worry that for some, it's getting in the way, damaging the credibility of the underlying message. Instead of arriving at the desired learning part, all they're getting is the smackdown. I certainly hope my posts are read and understood as slightly more nuanced than "Everything About PHP Sucks", "Everything About XML Sucks", or my personal favorite, "Everything About (your favorite technology) Sucks. Seriously."

I suppose it's also an issue of personal style. To me, writing without a strong voice, writing filled with second guessing and disclaimers, is tedious and difficult to slog through. I go out of my way to write in a strong voice because it's more effective. But whenever I post in a strong voice, it is also an implied invitation to a discussion, a discussion where I often change my opinion and invariably learn a great deal about the topic at hand. I believe in the principle of strong opinions, weakly held:

A couple years ago, I was talking to the Institute's Bob Johansen about wisdom, and he explained that – to deal with an uncertain future and still move forward – they advise people to have "strong opinions, which are weakly held." They've been giving this advice for years, and I understand that it was first developed by [former] Institute Director Paul Saffo. Bob explained that weak opinions are problematic because people aren't inspired to develop the best arguments possible for them, or to put forth the energy required to test them. Bob explained that it was just as important, however, to not be too attached to what you believe because, otherwise, it undermines your ability to "see" and "hear" evidence that clashes with your opinions. This is what psychologists sometimes call the problem of "confirmation bias."

So when you read one of my posts and hear this:

My name is Rex, and if you study with my eight-week program you will learn a system of self defense that I developed over two seasons of fighting in the Octagon. It's called... Rex Kwon Do!

Please consider it a strong opinion weakly held, a mock fight between fellow amateurs of equal stature, held in an Octagon where everyone retains their sense of humor, has an open mind, and enjoys a spirited debate where we all learn something.

Now bow to your sensei! Bow to your sensei!


Designing For Evil

Have you ever used Craigslist? It's an almost entirely free, mostly anonymous classified advertising service which evolved from an early internet phenomenon into a service so powerful it is often accused of single-handedly destroying the newspaper business. Unfortunately, these same characteristics also make Craigslist a particularly juicy target for spammers and evildoers. Who knows; maybe it's karma.

I consider Craigslist a generally benevolent public service. Perhaps that's why I was so disturbed by John Nagle's wartime narrative of the raging battle between Craigslist and spammers.

Spam on Craigslist has been a minor nuisance for years. Not any more. This year, the spammers started winning and are taking over Craigslist. Here's how they did it. Craigslist tries to stop spamming by:

  • Checking for duplicate submissions.
  • Blocking excessive posts from a single IP address.
  • Requiring users to register with a valid email address.
  • Using a CAPTCHA to stop automated posting tools.
  • Letting users flag postings they recognize as spam.

Several commercial products are now available to overcome those little obstacles to bulk posting. CL Auto Posting Tool is one such product. It not only posts to Craigslist automatically, it has built-in strategies to overcome each Craigslist anti-spam mechanism:

  • Random text is added to each spam message to fool Craigslist's duplicate message detector.
  • IP proxy sites are used to post from a wide range of IP addresses.
  • E-mail addresses for reply are Gmail accounts conveniently created by Jiffy Gmail Creator (ed. note: this does not break Google's CAPTCHA, as you can see in this screenshot.)
  • An OCR system reads the obscured text in the CAPTCHA.
  • Automatic monitoring detects when a posting has been flagged as spam and reposts it.

CL Auto Poster isn't the only such tool. Other desktop software products are AdBomber and Ad Master. For spammers preferring a service-oriented approach, there's ItsYourPost. With these power tools, the defenses of Craigslist have been overrun. Some categories on Craigslist have become over 90% spam. The personals sections were the first to go, then the services categories, and more recently, the job postings.

Craigslist is fighting back. Its latest gimmick is phone verification. Posting in some categories now requires a callback phone call, with a password sent to the user either by voice or as an SMS message. Only one account is allowed per phone number. Spammers reacted by using VoIP numbers. Craigslist blocked those. Spammers tried using number-portability services like Grand Central and Tossable Digits. Craigslist blocked those. Spammers tried using their own free ringtone sites to get many users to accept the Craigslist verification call, then type in the password from the voice message. Craigslist hasn't countered that trick yet.

Much of the back and forth battle can be followed in various forums. It's not clear yet who will win.
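The first item in that arms race, random text defeating a duplicate detector, is easy to see in miniature. Here is a minimal sketch, assuming (and this is purely an assumption; nothing above says how Craigslist actually detects duplicates) that the detector compares a hash of the normalized post text. The function names and sample ads are made up for illustration. A few junk characters change the hash completely, while a similarity measure over word trigrams barely moves:

```python
import hashlib


def exact_fingerprint(text: str) -> str:
    """Hash of the lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def shingles(text: str, size: int = 3) -> set:
    """Set of overlapping word n-grams ("shingles") in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


# Hypothetical ads: the second is the first plus a random token.
original = "cheap tickets available now call today for the best deal in town"
padded = original + " xq7z"
unrelated = "selling a used couch in good condition pickup only"

print(exact_fingerprint(original) == exact_fingerprint(padded))  # hashes differ
print(round(jaccard(original, padded), 3))     # still very similar
print(round(jaccard(original, unrelated), 3))  # nothing in common
```

Fuzzy matching like this only raises the cost for the spammer; a tool that rewrites whole sentences instead of appending junk would get past it too, which is rather the point of Nagle's account.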

I've used Craigslist quite a few times in the past, mostly to sell things that are too unwieldy to ship, with generally positive results. But that's the "for sale" section, and the spammers seem to be concentrating on the personals and services. I was curious about this, so I delved into the local personals section in what I guessed to be the most popular category. (Note to my wife: this is research! Research! I swear!)

Almost immediately I found a personals ad with the following "image":

Craigslist anti-scam image

It's an encoded wartime transmission from someone battling Craigslist spammers. It ends on this dire warning:

99.9% of the ads these days are fakes. Sad but true. REALLY, ALMOST ALL THE ADS ARE FAKE!

But is it true? I saw some obvious spam in the personals section – all of which had been flagged for removal by the time I clicked on it – but certainly nothing to corroborate this 99.9% claim. I did a few unique term searches on random personals (my favorite at the moment is "no murderers please!"), and they came up unique.

Clearly, there's a war on, and there have been casualties on both sides. Even if the spammers aren't winning, every inch they gain further undermines the community's trust in Craigslist and devalues everyone's participation.

This is a topic I am acutely interested in as we build stackoverflow.com out. Like Craigslist, Stack Overflow will offer a rich experience for anonymous internet users. We will not require you to create an account or "login" to answer or ask questions. We'll even track your reputation and preferred settings for you, as long as you allow us to store a standard browser cookie. While it's true that we'll initially be a low-value target due to limited traffic and a specialized audience, that will inevitably change over time. So you can expect some of the same measures on Stack Overflow that Craigslist and Wikipedia use to mitigate anonymous evil:

  • Some form of CAPTCHA.
  • The ability to temporarily "lock" controversial questions so only registered users can edit or add responses.
  • An automatic throttle if we see rapid, bot-like actions from your IP address.
  • Some basic heuristics to detect "spammy" content, such as too many URLs, or typing inhumanly fast.
  • An easy way for users with sufficient reputation to undo vandalism by reverting to an earlier version.
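To make two of those bullets concrete, here is a rough sketch of a per-IP throttle and a "too many URLs" check. None of this is actual Stack Overflow code: the class name, the function name, and every threshold are assumptions picked for illustration, and the "typing inhumanly fast" check is left out because it needs client-side timing data.

```python
import re
import time
from collections import defaultdict, deque
from typing import Optional

URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)


class SlidingWindowThrottle:
    """Allow at most max_actions per IP within the last window_seconds."""

    def __init__(self, max_actions: int, window_seconds: float):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # ip -> timestamps of allowed actions

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        recent = self._history[ip]
        # Forget actions that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_actions:
            return False  # bot-like burst: reject, or show a CAPTCHA instead
        recent.append(now)
        return True


def looks_spammy(text: str, max_urls: int = 3) -> bool:
    """Crude heuristic: flag posts containing more than max_urls links."""
    return len(URL_PATTERN.findall(text)) > max_urls


throttle = SlidingWindowThrottle(max_actions=3, window_seconds=10.0)
print([throttle.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3)])
print(looks_spammy("see http://a.example http://b.example "
                   "http://c.example http://d.example"))
```

As the Craigslist story shows, throttling by IP address is exactly what proxy networks are built to sidestep, so a check like this is one layer among several rather than a defense on its own.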

The community itself can also assist. Every question and answer on Stack Overflow can be rated Digg style; if a given bit of content rapidly accrues a large number of downmods, it is likely to be spam or inappropriate content, and will be automatically removed or directed into a moderation queue.
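The downmod rule could look something like the sketch below. The thresholds, the one-hour meaning of "rapidly", and the split between automatic removal and the moderation queue are all invented here for illustration; the real numbers would have to come from watching real traffic.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    QUEUE = "queue for moderation"
    REMOVE = "remove"


@dataclass
class Post:
    upvotes: int = 0
    downvotes: int = 0
    age_minutes: float = 0.0


def triage(post: Post, queue_at: int = 5, remove_at: int = 10,
           rapid_minutes: float = 60.0) -> Action:
    """Decide what to do with a post based on its net downmods."""
    net_down = post.downvotes - post.upvotes
    if net_down < queue_at:
        return Action.KEEP
    # Heavy downmodding of a young post is treated as near-certain spam.
    if post.age_minutes <= rapid_minutes and net_down >= remove_at:
        return Action.REMOVE
    # Anything else over the lower threshold gets a human look.
    return Action.QUEUE


print(triage(Post(upvotes=2, downvotes=3, age_minutes=10)))
print(triage(Post(upvotes=0, downvotes=6, age_minutes=10)))
print(triage(Post(upvotes=1, downvotes=12, age_minutes=30)))
```

A rule like this is itself something evil users will try to game, for instance by mass-downmodding a legitimate post, which is one reason to route borderline cases to a queue instead of deleting them outright.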

Don't get me wrong. I've been humbled by the quality – and the sheer size – of the community that has grown up around this blog. I expect the overwhelming majority of people who participate in Stack Overflow will be upstanding Internet citizens. Wikipedia is a living testament to the fact that goodness vastly outnumbers evil. We good guys can win, if we have the forethought to put some controls in place first.

Allowing anonymous users to post creates a volatile situation where a dozen sufficiently motivated spammers can easily poison the well for thousands of typical users. These spammers don't give a damn about the community we're building together. All they care about is getting paid by posting their links anywhere and everywhere they can. They'll run roughshod over as many websites and pages as possible in their frantic, abusive pursuit of money. If I didn't so desperately want to choke the life out of each and every one of them, I might actually feel sorry for the poor bastards.

But here's the problem: following the rules and being a good citizen is easy. Being evil is hard; it takes more work. Sometimes a lot more work. The bad guys get paid to learn about their exploits. Are you willing to educate yourself about the complex evil that a tiny minority of powerful users are prepared to unleash upon your site?

As with so many things in life, this is best illustrated by a scene from Spaceballs:

So, Lone Starr, now you see that evil will always triumph, because good is dumb.

As the good guys, we can't afford to be ignorant of the spammers' techniques. If that means spelunking through the grimiest corners of some scummy black hat forums, then so be it. I'll tell you this: I've never nofollowed a single link on this blog until today. The most effective way to fight the evil spammers is to understand them, and the first step toward understanding evil is openly linking to their tools and methods, exposing them to as much public scrutiny as possible.

When you design your software, work under the assumption that some of your users will be evil: out to game the system, to defeat it at every turn, to cause interruption and denial of service, to attack and humiliate other users, to fill your site with the vilest, nastiest spam you can possibly imagine. If you don't do that, you'll end up with something like blog trackbacks, which are irreparably busted at this point. Trackbacks are the source of countless untold hours of institutionalized spam pain and suffering, all because the initial designers apparently did not ask themselves one simple question: what if some of our users are evil?

Because when good is dumb, evil will always triumph.

Websites that allow users to post content will always be vulnerable to the actions of a handful of evil, spammy users. It's not pleasant. It is a dark mirror into the ugly underbelly of human nature. But it's also an unfortunate, unavoidable fact of life. And when you fail to design for evil, you have failed your community.


It's Clay Shirky's Internet, We Just Live In It

I can't remember when, exactly, I discovered Clay Shirky, but I suspect it was around 2003 or so. I sent him an email about micropayments, he actually answered it, and we had a rather nice discussion on the topic. I've been a fan of Clay's writing ever since. (In case you're curious, Clay was right -- micropayments are dead -- and I was dead wrong. All the more reason to be a fan.)

I don't think you'll find a smarter, more articulate writer on the topic of internet community than Clay Shirky. His A Group Is Its Own Worst Enemy, for example, is the seminal article on the folly of addressing social software problems purely through technology. I've referenced Clay a number of times on this blog, and his writing seems more and more prescient with each passing year. It's Clay Shirky's Internet; we just live in it.

Gin, Television, and Social Surplus is a more recent example:

Did you ever see that episode of Gilligan's Island where they almost get off the island and then Gilligan messes up and then they don't? I saw that one. I saw that one a lot when I was growing up. And every half-hour that I watched that was a half an hour I wasn't posting at my blog or editing Wikipedia or contributing to a mailing list. Now I had an ironclad excuse for not doing those things, which is none of those things existed then. I was forced into the channel of media the way it was because it was the only option. Now it's not, and that's the big surprise. However lousy it is to sit in your basement and pretend to be an elf, I can tell you from personal experience it's worse to sit in your basement and try to figure if Ginger or Mary Ann is cuter.

And I'm willing to raise that to a general principle. It's better to do something than to do nothing. Even lolcats, even cute pictures of kittens made even cuter with the addition of cute captions, hold out an invitation to participation. When you see a lolcat, one of the things it says to the viewer is, "If you have some sans-serif fonts on your computer, you can play this game, too." And that message -- I can do that, too -- is a big change.

This is something that people in the media world don't understand. Media in the 20th century was run as a single race -- consumption. How much can we produce? How much can you consume? Can we produce more and you'll consume more? And the answer to that question has generally been yes. But media is actually a triathlon, it's three different events. People like to consume, but they also like to produce, and they like to share.

It's exactly this sort of deep, penetrating insight which makes me wonder if Clay Shirky will be looked back on as one of the key historical figures of the nascent internet era. Maybe I'm just a naive fanboy, but the guy seems to see a lot farther than everyone else. So you can imagine the great interest I had in Clay's new book, Here Comes Everybody: The Power of Organizing Without Organizations.

Here Comes Everybody: The Power of Organizing Without Organizations

(I'm showing the UK version of the book cover because it's about a zillion times better than the US cover. Seriously, what were they thinking?)

After reading Here Comes Everybody, I'm happy to report that it does not disappoint. I'd even go so far as to say if you're developing social software of any kind, this book should be required reading. I feel so strongly about this, in fact, that I just gave my copy to my stackoverflow coding partner. And I will be following up with pop quizzes. What's that, you say? You don't develop social software? Are you sure?

So I said, narrow the focus. Your "use case" should be, there's a 22 year old college student living in the dorms. How will this software get him laid?

That got me a look like I had just sprouted a third head, but bear with me, because I think that it's not only crude but insightful. "How will this software get my users laid" should be on the minds of anyone writing social software (and these days, almost all software is social software).

"Social software" is about making it easy for people to do other things that make them happy: meeting, communicating, and hooking up.

As Jamie Zawinski once said, these days, almost all software is social software.

If you're not able to devote the time to the book, I encourage you to at least check out Clay's 42 minute presentation on "Here Comes Everybody" from earlier this year.

I found the introduction particularly inspiring; I've transcribed it here.

I've been writing principally for an audience of programmers and engineers and techies and so forth for about a dozen years. I wanted to write this book for a general audience, because the effects of the internet are now becoming broadly social enough that there is a general awareness that the internet isn't a decoration on contemporary society, but a challenge to it. A society that has an internet is a different kind of society, in the same way that a society that has a printing press was a different kind of society. We're living through the largest increase in human expressive capability in history.

It's a big claim. There are really only four revolutions that could compete for that:

  1. The printing press and movable type considered as one broad period of innovation.
  2. Telegraph and telephone considered as one broad period of innovation.
  3. Recorded media of all types, first images, then sound, then moving images, then moving images with sound.
  4. Finally, the ability to harness broadcast.

These are the media revolutions that existed as part of the landscape prior to our historical generation. There is a curious asymmetry to them, which is the ones that create groups don't create two-way communication, and the ones that create two-way communications don't create groups. Either you had something like a magazine or television, where the broadcast was from the center to the edge, but the relationship was between producer and consumer. Or you had something like the telephone, where people could engage in a two-way conversation, but the medium didn't create any kind of group.

And then there's now. What we've got is a network that is natively good at group forming. In fact, this isn't just a fifth revolution. It holds the contents of the previous revolutions, which is to say we can now distribute music and movies and conversations all in this medium. But the other thing it does is move us into a world of two-way groups. Thirty years from now, when I'm presenting this book, if I had to describe it in one bullet point -- this is what the bullet point would say:

Group Action Just Got Easier.

This is, in the context of change in our historical generation, the big deal. This isn't just a new way of broadcasting information, it isn't just a new way of having two way communication, it actually engages groups. In this medium, freedom of speech, freedom of the press, and freedom of assembly are all now the same freedom. And the spread of that capability is the big deal.

Now, it could be that blogging and working on stackoverflow is clouding my perspective, making these social software issues unusually relevant to my work. When I wrote:

I realized, that's it. That's it exactly. That is what is so intensely satisfying about writing here. My happiness only becomes real when I share it with all of you.

I didn't realize the serendipitous parallels between that sentiment and Clay's claim that the internet runs on love:

In the past, we could do little things for love, but big things, big things required money. Now, we can do big things for love.

I have no idea if stackoverflow will be a "big thing" or not. But it sure is nice to wake up in the morning and work on building a community of people who love computers and code as much as I do.

I love code

Or maybe I'm just a hopeless romantic.


OpenID: Does The World Really Need Yet Another Username and Password?

As we continue to work on the code that will eventually become stackoverflow, we belatedly realized that we'd be contributing to the glut of usernames and passwords on the web. I have fifty online logins, and I can't remember any of them! Adding that fifty-first set of stackoverflow.com credentials is unlikely to help matters.

With some urging from my friend Jon Galloway, I decided to take a look at OpenID. OpenID aims to solve the login explosion problem:

OpenID eliminates the need for multiple usernames across different websites, simplifying your online experience.

You get to choose the OpenID Provider that best meets your needs and most importantly that you trust. At the same time, your OpenID can stay with you, no matter which Provider you move to. And best of all, the OpenID technology is not proprietary and is completely free.

In the spirit of Show, Don't Tell, here's how it works:

Let's say you're visiting a new website for the first time. As you browse around, eventually you'll do something that requires more than anonymous guest access. So you'll get shunted to the "create a new account" page, in whatever form that takes. I'm sure everyone reading this knows the drill. But if the website is OpenID enabled, you don't have to go through all the typical rigamarole necessary to create a new account. Instead, you can enter your OpenID login:

openid login

I'm going to indulge in a bit of hand waving here and assume that you already have an OpenID login. It's not such a terrible stretch, honestly; every AOL and Yahoo user already has an OpenID login even if they don't know it yet.

OpenIDs are technically URLs. Here are a few examples:

  • http://claimid.com/yourname
  • http://yourname.signon.com
  • https://me.yahoo.com/yourname

That's one usability problem with OpenID: you have to remember a relatively complete personal URL that no two OpenID providers define the same way. Which compares unfavorably to, say, remembering your email address. There are shortcuts around this that I'll describe later, but for now, there's ID selector, which provides a reasonably friendly UI for building an OpenID login URL.

openid login helper

If you enter the right URL, you'll get redirected to your OpenID provider, where you'll enter your single set of login credentials.

openid provider login

You'll be prompted to add this site to your provider's list of "trusted sites" for your account. Once you do this, you can bypass all of these steps the next time you're on the site.

openid-2-transfer.png

And, finally, you're logged in for the first time!

openid return
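Under the hood, the hop from the website to your provider is just a browser redirect with a handful of query parameters. Here is a rough sketch of how a site might build that redirect using the OpenID Authentication 2.0 parameter names. The provider endpoint and return URL are placeholders, the function name is my own, and the discovery step (turning your OpenID URL into the provider's endpoint) and the verification of the provider's signed response are both skipped.

```python
from urllib.parse import parse_qs, urlencode, urlparse

OPENID_NS = "http://specs.openid.net/auth/2.0"


def build_auth_redirect(provider_endpoint: str, claimed_id: str,
                        return_to: str, realm: str) -> str:
    """Build the URL the browser is sent to for an interactive login."""
    params = {
        "openid.ns": OPENID_NS,
        "openid.mode": "checkid_setup",  # interactive: provider may prompt
        "openid.claimed_id": claimed_id,
        # Simplification: with delegation these two values can differ.
        "openid.identity": claimed_id,
        "openid.return_to": return_to,   # where the provider sends you back
        "openid.realm": realm,           # the site shown on the trust prompt
    }
    separator = "&" if urlparse(provider_endpoint).query else "?"
    return provider_endpoint + separator + urlencode(params)


url = build_auth_redirect(
    provider_endpoint="https://provider.example/auth",  # hypothetical
    claimed_id="http://claimid.com/yourname",
    return_to="https://site.example/openid/return",     # hypothetical
    realm="https://site.example/",
)
print(parse_qs(urlparse(url).query)["openid.mode"])
```

The realm is what the provider displays when it asks whether to add the site to your trusted list, and the return_to URL is where you land for the final "you're logged in" step.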

If that seems like extra work -- and remember, I'm not counting the time it took to set up the initial account at ClaimID, either -- well, I won't lie to you. It is more work. But it's worth noting that:

  1. The cost of account creation at your OpenID provider can eventually be amortized across dozens of sites which will all accept those same credentials.

  2. After the first OpenID login at a particular site, assuming you've added that site to your trust list, subsequent logins are literally one-click operations.

It's not exactly frictionless, but it's a heck of an improvement over having to remember 50 different usernames and passwords for 50 different websites, wouldn't you say? I think it compares quite favorably with the current champion of frictionless communication: anonymous comment boxes. They typically have three fields to fill out: username, URL, and email. OpenID requires only one. Your provider can proxy your URL and email back to the blog automatically from your provider profile, if you choose a smart provider with attribute exchange support.

Which brings me to the other problem with OpenID. The quality of your OpenID experience is heavily influenced by the provider you choose. For example, Yahoo! is smart enough to work even if you enter nothing but "yahoo.com" as your OpenID URL. That is, assuming you've enabled OpenID support for your Yahoo! login. Providers can also offer unique functionality that sets them apart, too. For example, SignOn.com allows the use of Information Cards in Windows, so you can log into a website without ever typing in a password! It's a bit of work, as you have to associate the Information Card with your provider account first, but I tried it, and it works as advertised.
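Part of what makes a bare "yahoo.com" workable is that the accepting site is expected to normalize whatever you type before doing anything else with it. Below is a minimal sketch of the URL half of that normalization, loosely following the OpenID 2.0 rules (add a scheme if it's missing, drop any fragment). The function name is my own, XRI identifiers are ignored entirely, and the provider still has to answer discovery at the resulting URL, which is the part Yahoo! gets right.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_openid_url(user_input: str) -> str:
    """Turn what the user typed into a canonical http(s) identifier URL."""
    candidate = user_input.strip()
    # No scheme given: assume plain http, as the OpenID 2.0 rules do.
    if not candidate.lower().startswith(("http://", "https://")):
        candidate = "http://" + candidate
    scheme, netloc, path, query, _fragment = urlsplit(candidate)
    # Lowercase the scheme and host, default the path to "/", drop the fragment.
    return urlunsplit((scheme.lower(), netloc.lower(), path or "/", query, ""))


for typed in ("yahoo.com", " claimid.com/yourname ",
              "https://me.yahoo.com/yourname#profile"):
    print(repr(typed), "->", normalize_openid_url(typed))
```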

My experiments with OpenID were quite positive, but all is not wine and roses in the land of OpenID. Stefan Brands identifies some potentially large problems with OpenID, backed by exhaustive references:

  1. Phishing. A malicious site could visit the OpenID provider URL you gave it, screen-scrape your login form, and present it locally, intercepting your login and password. However, if you choose a quality OpenID provider, they'll use SSL and a high-grade certificate so you'll have some confidence you're not being fooled. Yahoo also offers anti-phishing image watermarks for OpenID logins, as well.

  2. Privacy. Your OpenID provider will know, by definition, every site you log into using its credentials. So I hope you trust your provider.

  3. Centralized Risk. If your OpenID account is compromised, every site you used it to access is also compromised. I'm not sure how much riskier this is than having your email credentials compromised, as many (most?) sites allow you to send a password reset to your email address.

  4. Lack of Trust. The OpenID providers provide no identity checking whatsoever. It's sort of like those generic "identity cards" you can obtain online, which are pretty useless next to, say, your Driver's License, which was issued by a local governmental authority. What if Fake Steve Jobs created a fake OpenID purporting to be Steve Jobs, or a fake OpenID provider?

  5. Additional Complexity. Your login now involves two completely different entities: the website you're attempting to gain access to, and your OpenID provider. You have to understand this new relationship to troubleshoot any problems with your login -- and the OpenID provider has to be up and running for you to log in at all.

  6. Adoption Inequality. It's easy for AOL, Yahoo!, Six Apart, and Technorati to become OpenID providers -- but what good does that do you when there are very few OpenID consumers? As Dare points out, there are no financial incentives to accept credentials from your competitors, but there are certainly plenty of incentives for driving account creation on your own site. For now, I expect OpenID to be driven primarily by small applications and sites that don't have millions of dollars of skin in the game.

As I mentioned above, I feel most of these criticisms can be mitigated by picking a quality, trustworthy OpenID Provider. Particularly one that uses SSL. Since it's an open ecosystem, I'd hope the more reputable and reliable OpenID providers would rise to the top. And consider the advantages: as an application developer, you no longer have to store passwords! That's a huge advantage, because storing passwords is the last business you want to be in. Trust me on this one.
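To make "the last business you want to be in" a little more concrete, here is a minimal Python sketch of the least a site has to do if it does store passwords: a random per-user salt, a deliberately slow hash, and a constant-time comparison at login. The function names and the iteration count are assumptions for illustration, not a recommendation from any particular provider or standard.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for this sketch; a real deployment would tune it.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store in place of the plaintext password."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
```

Even this much leaves out password resets, rate limiting, and upgrading old hashes, which is the kind of ongoing work that delegating login to a provider lets you avoid.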

I also found Jan Miksovsky's criticisms of the user experience of OpenID -- as of 6 months ago -- fairly damning:

And all this is for -- what, exactly? To save me from having to pick a user name and password? As annoying as that can be, it's just not that hard! Remembering an arbitrary user name does cause real trouble, but simply allowing email addresses to be used as IDs can solve almost all of that problem. As more and more sites allow email addresses as IDs, the need for OpenID becomes less compelling to a consumer.

For the time being, I can't imagine a sane business operator forcing their precious visitors through this gauntlet of user experience issues just for the marginal benefits that accrue to a shared form of ID. I've read numerous claims that all it will take is for someone big like Google to support OpenID to crack this problem open. Unfortunately, there's no business of any size that can afford to direct their traffic down a dead end.

Most service operators will, at best, offer users a choice between using a proprietary ID or an OpenID, creating a terrible economic proposition for a consumer. Faced with the proposition of: 1) struggling once for thirty minutes to struggle through a process they can barely understand, or 2) spending two minutes on every new site breezing through a familiar process they've done countless times before, normal busy people will choose the familiar route time and time again. I'll bet anything that most people will keep going for proprietary IDs, further deferring the network effects possible from OpenID adoption.

Perhaps the most compelling point Jan makes is this one: it is a bit odd to ask users to associate themselves with an arbitrary URL instead of an email address. I definitely saw some rough edges in today's experimentation, but I'd say the user experience has improved since Jan looked at OpenID. That's encouraging.

I realize that OpenID is far from an ideal solution. But right now, the one-login-per-website problem is so bad that I am willing to accept these tradeoffs for a partial, worse-is-better solution. There's absolutely no way I'd put my banking credentials behind an OpenID. But there are also dozens of sites that I don't need anything remotely approaching banking-grade security for, and I use these sites far more often than my bank. The collective pain of remembering all these logins -- and the way my email inbox becomes a de facto collecting point and security gateway for all of them -- is substantial.

If you're a software developer building an application that requires user accounts, please consider using OpenID rather than polluting the world with yet another login and password. I also encourage you to experiment with OpenID as a user. Create one. Try logging in somewhere with one. If you don't like the experience, or if you agree with one (or more) of the criticisms I listed above, how can we collectively fix it? We desperately need a solution to the login explosion, and right now the only thing I've seen on the horizon that has any kind of critical mass whatsoever is OpenID.

If we can't make OpenID work, at least for the run-of-the-mill, low-value credentials that litter the web in increasing numbers -- what hope do we have of ever fixing the login explosion problem?

PHP Sucks, But It Doesn't Matter

Here's a list of every function beginning with the letter "A" in the PHP function index:

abs()
acos()
acosh()
addcslashes()
addslashes()
aggregate()
aggregate_info()
aggregate_methods()
aggregate_methods_by_list()
aggregate_methods_by_regexp()
aggregate_properties()
aggregate_properties_by_list()
aggregate_properties_by_regexp()
aggregation_info()
apache_child_terminate()
apache_get_modules()
apache_get_version()
apache_getenv()
apache_lookup_uri()
apache_note()
apache_request_headers()
apache_reset_timeout()
apache_response_headers()
apache_setenv()
apc_add()
apc_cache_info()
apc_clear_cache()
apc_compile_file()
apc_define_constants()
apc_delete()
apc_fetch()
apc_load_constants()
apc_sma_info()
apc_store()
apd_breakpoint()
apd_callstack()
apd_clunk()
apd_continue()
apd_croak()
apd_dump_function_table()
apd_dump_persistent_resources()
apd_dump_regular_resources()
apd_echo()
apd_get_active_symbols()
apd_set_pprof_trace()
apd_set_session()
apd_set_session_trace()
apd_set_socket_session_trace()
array()
array_change_key_case()
array_chunk()
array_combine()
array_count_values()
array_diff()
array_diff_assoc()
array_diff_key()
array_diff_uassoc()
array_diff_ukey()
array_fill()
array_fill_keys()
array_filter()
array_flip()
array_intersect()
array_intersect_assoc()
array_intersect_key()
array_intersect_uassoc()
array_intersect_ukey()
array_key_exists()
array_keys()
array_map()
array_merge()
array_merge_recursive()
array_multisort()
array_pad()
array_pop()
array_product()
array_push()
array_rand()
array_reduce()
array_reverse()
array_search()
array_shift()
array_slice()
array_splice()
array_sum()
array_udiff()
array_udiff_assoc()
array_udiff_uassoc()
array_uintersect()
array_uintersect_assoc()
array_uintersect_uassoc()
array_unique()
array_unshift()
array_values()
array_walk()
array_walk_recursive()
ArrayIterator::current()
ArrayIterator::key()
ArrayIterator::next()
ArrayIterator::rewind()
ArrayIterator::seek()
ArrayIterator::valid()
ArrayObject::__construct()
ArrayObject::append()
ArrayObject::count()
ArrayObject::getIterator()
ArrayObject::offsetExists()
ArrayObject::offsetGet()
ArrayObject::offsetSet()
ArrayObject::offsetUnset()
arsort()
ascii2ebcdic()
asin()
asinh()
asort()
aspell_check()
aspell_check_raw()
aspell_new()
aspell_suggest()
assert()
assert_options()
atan()
atan2()
atanh()

I remember my first experience with PHP way back in 2001. Despite my questionable pedigree in ASP and Visual Basic, browsing an alphabetical PHP function list was enough to scare me away for years. Somehow, perusing the above list, I don't think things have improved a whole lot since then.
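One rough way to see the inconsistency is to group names by whatever comes before the first underscore. The Python sketch below does that for a hand-picked sample of ten names from the list above. The sample and the `family` helper are chosen for illustration only and are not a full analysis of the function index.

```python
from collections import Counter

# A hand-picked sample from the list above, not the full index.
SAMPLE = [
    "abs", "addslashes", "array_keys", "array_map", "array_walk",
    "arsort", "asort", "apache_note", "apc_fetch", "ascii2ebcdic",
]


def family(name: str) -> str:
    """Return the text before the first underscore, or "(none)" without one."""
    prefix, separator, _rest = name.partition("_")
    return prefix if separator else "(none)"


counts = Counter(family(name) for name in SAMPLE)
for prefix, total in counts.most_common():
    print(prefix, total)
```

In this sample, arsort() and asort() both sort arrays, yet neither lands in the array family, which is the kind of naming drift the full list shows at scale.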

I'm no language elitist, but language design is hard. There's a reason that some of the most famous computer scientists in the world are also language designers. And it's a crying shame none of them ever had the opportunity to work on PHP. From what I've seen of it, PHP isn't so much a language as a random collection of arbitrary stuff, a virtual explosion at the keyword and function factory. Bear in mind this is coming from a guy who was weaned on BASIC, a language that gets about as much respect as Rodney Dangerfield. So I am not unfamiliar with the genre.

Of course, this is old news. How old? Ancient. Internet Explorer 4 old. The internet is overrun with PHP sucks articles – I practically ran out of browser tabs opening them all. Tim Bray bravely bucked this trend and went with the title On PHP for his entry in the long-running series:

So here's my problem, based on my limited experience with PHP (deploying a couple of free apps to do this and that, and debugging a site for a non-technical friend here and there): all the PHP code I've seen in that experience has been messy, unmaintainable crap. Spaghetti SQL wrapped in spaghetti PHP wrapped in spaghetti HTML, replicated in slightly-varying form in dozens of places.

Tim's article is as good a place to start as any; he captured a flock of related links in the ensuing discussion. As you read, you'll find there's an obvious parallel between the amateurish state of PHP development and Visual Basic 6, a comparison that many developers have independently arrived at.

Fredrik Holmström:

Every solution I've ever seen or developed in PHP feels clunky and bulky, there is no elegance or grace. Working with PHP is a bit like throwing a 10 pound concrete cube from a ten story building: You'll get where you're going fast, but it's not very elegant. ... I love PHP, and it's the right tool for some jobs. It's just an ugly, cumbersome tool that makes me cry and have nightmares. It's the new VB6 in a C dress.

Karl Seguin:

From my own experience, and the countless of online tutorials and blogs, many PHP developers are guilty of the same crap code VB developers were once renowned for. OO, N-Tier, exception handling, domain modeling, refactoring and unit testing are all foreign concepts in the PHP world.

Understand that as a longtime VB developer, I am completely sympathetic to the derision you'll suffer when programming in a wildly popular programming language that isn't considered "professional".

TIOBE index, May 2008

I've written both VB and PHP code, and in my opinion the comparison is grossly unfair to Visual Basic. Does PHP suck? Of course it sucks. Did you read any of the links in Tim's blog entry? It's a galactic supernova of incomprehensibly colossal, mind-bendingly awful suck. If you sit down to program in PHP and have even an ounce of programming talent in your entire body, there's no possible way to draw any other conclusion. It's inescapable.

But I'm also here to tell you that doesn't matter.

The TIOBE community index I linked above? It's written in PHP. Wikipedia, which is likely to be on the first page of anything you search for these days? Written in PHP. Digg, the social bookmarking service so wildly popular that a front page link can crush the beefiest of webservers? Written in PHP. WordPress, arguably the most popular blogging solution available at the moment? Written in PHP. YouTube, the most widely known video sharing site on the internet? Written in PHP. Facebook, the current billion-dollar zombie-poking social networking darling of venture capitalists everywhere? Written in PHP. (Update: While YouTube was originally written in PHP, it migrated to Python fairly early on, per Matt Cutts and Guido van Rossum.)

Notice a pattern here?

Some of the largest sites on the internet – sites you probably interact with on a daily basis – are written in PHP. If PHP sucks so profoundly, why is it powering so much of the internet?

The only conclusion I can draw is that building a compelling application is far more important than choice of language. While PHP wouldn't be my choice, and if pressed, I might argue that it should never be the choice for any rational human being sitting in front of a computer, I can't argue with the results.

You've probably heard that sufficiently incompetent coders can write FORTRAN in any language. It's true. But the converse is also true: sufficiently talented coders can write great applications in terrible languages, too. It's a painful lesson, but an important one.

Why fight it? I say learn to embrace it. Join with me, won't you, in celebrating the next fifty years of glorious PHP code driving the internet. Just don't forget to call the maintain_my_will_to_live() PHP function every so often!
