Coding Horror

programming and human factors

When In Doubt, Make It Public

Marc Hedlund offered some unique advice to web entrepreneurs last month:

One of my favorite business model suggestions for [web] entrepreneurs is to find an old UNIX command that hasn't yet been implemented on the web, and fix that.

To illustrate, Marc provides a list of UNIX commands with their corresponding web implementations:

UNIX command    Web implementation
talk, finger    ICQ
LISTSERV        DejaNews
ls              Yahoo! directory
find, grep      Google
rn              Bloglines
pine            Google Mail
mount           Amazon S3
bash            Yahoo! Pipes
wall            Twitter

Jason Kottke noted that most successful "new" business models on the web aren't new at all-- they're simply taking what was once private and making it public and permanent:

Blogger = public email messages. (1999) Instead of "Dear Bob, Check out this movie." it's "Dear People I May or May Not Know Who Are Interested in Film Noir, check out this movie. If you like it, maybe we can be friends."

Flickr = public photo sharing. (2004) Flickr co-founder Caterina Fake said in a recent interview: "When we started the company, there were dozens of other photosharing companies such as Shutterfly, but on those sites there was no such thing as a public photograph -- it didn't even exist as a concept -- so the idea of something 'public' changed the whole idea of Flickr."

YouTube = public home videos. (2005) Bob Saget was onto something.

Twitter = public IM. (2006) I don't think it's any coincidence that one of the people responsible for Blogger is also responsible for Twitter.

But you don't have to found a new Web 2.0 company to benefit from the power of public information. Even brick and mortar companies are finally realizing that the age-old principle of "secret by default" may not be the best policy today:

Companies used to assume that details about their internal workings were valuable precisely because they were secret. If you were cagey about your plans, you had the upper hand; if you kept your next big idea to yourself, people couldn't steal it. Now, billion-dollar ideas come to CEOs who give them away; corporations that publicize their failings grow stronger. Power comes not from your Rolodex but from how many bloggers link to you - and everyone trembles before search engine rankings.

Power, it seems, comes from public information. Secrets are only a source of powerlessness. Just ask Brad Abrams, who poses this rhetorical question:

If no one knows you did X, did you really get all the benefits for doing X?

I think Brad is being a bit too cautious here. I'll go one step further. Until you've..

  • Written a blog entry about X
  • Posted Flickr photos of X
  • Uploaded a video of X to YouTube
  • Typed a Twitter message about X

.. did X really happen at all?

This is not to say we should fill the world with noise on every mundane aspect of our existence. But who decides what is mundane? Who decides what is interesting? Everything's interesting to someone, even if that someone is only you and a few other people in the world.

It's my firm belief that the inclusionists are winning. We live in a world of infinitely searchable micro-content, and every contribution, however small, enriches all of us. But more selfishly, if you're interested in deriving maximum benefit from your work, there's no substitute for making it public and findable. Obscurity sucks. But obscurity by choice is irrational. When in doubt, make it public.


Reddit: Language vs. Platform

My previous entry, Twitter: Service vs. Platform, was widely misunderstood. I suppose I only have myself to blame, so I'll try to clarify with another example.

Consider Reddit. The Reddit development team switched from Lisp to Python late in 2005:

If Lisp is so great, why did we stop using it? One of the biggest issues was the lack of widely used and tested libraries. Sure, there is a CL library for basically any task, but there is rarely more than one, and often the libraries are not widely used or well documented. Since we're building a site largely by standing on the shoulders of others, this made things a little tougher. There just aren't as many shoulders on which to stand.

On that note, if you have been considering writing a web application in Lisp, go for it. It will be tough if you're not already a Lisper, but you will learn a lot along the way, and it will be worth it I am sure. Lisp is especially great for projects where the end goal is unknown because it's so easy to steer in different directions. Lisp will never get in your way, although sometimes the environment will.

Language performance is a red herring. That's especially true when we're comparing dynamic languages like Ruby, Lisp, and Python that will never be known for their high octane, nitro burnin' performance levels. I assumed Alex Payne knew that when he chose to specifically call out Ruby language performance, but maybe I assumed wrong.

When you choose a language, like it or not, you've chosen a platform. And as Steve so patiently and calmly explained to all the Lisp enthusiasts, the platform around the language, more than the language itself, sets the tone for your development experience. The availability of common, popular libraries and the maturity of the development environment end up trumping any particular significance the language holds.

That's why the Reddit switch makes good business sense: they didn't change languages; they changed platforms. At the point at which your choice of platform starts to jeopardize your service, you switch platforms, exactly as Reddit did. Your users don't give a damn what framework and language you're using. The only people who care about that stuff are other software developers. And God help you if your users are software developers; then you're really in trouble.

But things aren't all roses in Python-land either. The Reddit developers initially used a Rails-like web application framework, with decidedly mixed results:

The framework that seems most promising is Django and indeed the authors of reddit initially attempted to rewrite their site in it. I was curious about their experience, so I carefully followed them along, trying to help them out.

Django seemed great from the outside: a nice-looking website, intelligent and talented developers, and a seeming surplus of nice features. The developers and community are extremely helpful and responsive to patches and suggestions. And all the right goals are espoused in their philosophy documents and FAQs. Unfortunately, however, they seem completely incapable of living up to them.

While Django claims that it's "loosely coupled", using it pretty much requires fitting your code into Django's worldview. Django insists on executing your code itself, either through its command-line utility or a specialized server handler called with the appropriate environment variables and Python path. When you start a project, by default Django creates folders nested four levels deep for your code and while you can move around some files, I had trouble figuring out which ones and how.

Django's philosophy says "Explicit is better than implicit", but Django has all sorts of magic. Database models you create in one file magically appear someplace else deep inside the Django module with a different name. When your model function is called, new things have been added to its variable-space and old ones removed. (I'm told they're currently working on fixing both of these, though.)

Note that any analogies I'm drawing between Rails and Django here are purely intentional.

Not that there's anything wrong with adopting a web application framework. But at least in Python you have a choice of web application frameworks. Instead of investing in the Django worldview, the Reddit team decided that the lighter weight web.py better suited their needs. Similarly, some ASP.NET developers reject the entire page lifecycle model, preferring to write their own HttpHandlers and HttpModules for finer-grained control over what's happening on their website. And that's fine; the ASP.NET platform accommodates both camps of developers.
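To make "lighter weight" concrete, here is a minimal sketch of a web application written with no framework at all, using only the WSGI interface from the Python standard library. This is not web.py and it is not Reddit's code; the `app` and `call` names, the routes, and the response bodies are all made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A bare WSGI application: route on the path, build the response by hand."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", b"front page"
    else:
        status, body = "404 Not Found", b"not found"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call(path):
    """Invoke the app directly, without a server, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)   # fills in the keys a WSGI server would provide
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Nothing here executes your code for you or dictates a folder layout; the price is that routing, headers, and everything else a framework provides are now your problem.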

It's true that Twitter represents an extreme case, but it sure looks like the Twitter developers could benefit from a choice of web application frameworks, too. In the end, it's about choice and flexibility. Not just in the language, but in the platform that inevitably comes along with any language.


Twitter: Service vs. Platform

Twitter is a victim of its own success. The site has massive scaling problems, to the tune of 11,000 pageviews per second. According to this interview with a Twitter developer, a lot of the scaling problems are attributable to Twitter's choice of platform:

By various metrics Twitter is the biggest Rails site on the net right now. Running on Rails has forced us to deal with scaling issues - issues that any growing site eventually contends with - far sooner than I think we would on another framework.

The common wisdom in the Rails community at this time is that scaling Rails is a matter of cost: just throw more CPUs at it. The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there's no facility in Rails to talk to more than one database at a time. The solutions to this are caching the hell out of everything and setting up multiple read-only slave databases, neither of which are quick fixes to implement. So it's not just cost, it's time, and time is that much more precious when people can['t] reach your site.

None of these scaling approaches are as fun and easy as developing for Rails. All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise. Once you hit a certain threshold of traffic, either you need to strip out all the costly neat stuff that Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.) or move the slow parts of your application out of Rails, or both.

It's also worth mentioning that there shouldn't be doubt in anybody's mind at this point that Ruby itself is slow. It's great that people are hard at work on faster implementations of the language, but right now, it's tough. If you're looking to deploy a big web application and you're language-agnostic, realize that the same operation in Ruby will take less time in Python. All of us working on Twitter are big Ruby fans, but I think it's worth being frank that this isn't one of those relativistic language issues. Ruby is slow.
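The two mitigations named in the interview, caching and read-only slave databases, can be sketched in a few lines. This is a toy model under loud assumptions: plain dictionaries stand in for the databases and the cache, replication is instantaneous, and the `ReplicatedStore` class and its method names are hypothetical, not anything from Rails or Twitter.

```python
import itertools


class ReplicatedStore:
    """Toy read/write splitter: one primary, several read-only replicas, one cache."""

    def __init__(self, replica_count=2):
        self.primary = {}                                   # accepts all writes
        self.replicas = [{} for _ in range(replica_count)]  # serve reads only
        self._next_replica = itertools.cycle(self.replicas)
        self.cache = {}                                     # checked before any database
        self.replica_reads = 0                              # reads that reached a replica

    def write(self, key, value):
        self.primary[key] = value
        # Real replication is asynchronous; copying immediately keeps the sketch simple.
        for replica in self.replicas:
            replica[key] = value
        # Drop the cached copy so the next read sees the new value.
        self.cache.pop(key, None)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        replica = next(self._next_replica)  # round-robin across replicas
        self.replica_reads += 1
        value = replica.get(key)
        self.cache[key] = value
        return value


store = ReplicatedStore()
store.write("status:42", "eating lunch")
first = store.read("status:42")    # cache miss, goes to a replica
second = store.read("status:42")   # cache hit, no database touched
```

Even in toy form you can see why neither is a quick fix: every write path has to know about cache invalidation, and every read path has to know which database it is allowed to talk to.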

I've often said that performance doesn't always matter. But if, like Twitter, your business model is predicated on how fast your users can press the Refresh button in their browser, you could be in serious trouble if your service becomes popular.

What I find particularly amusing is the performance comparison with Python. It's hard to believe that Python is that much faster than Ruby. Python, like Ruby, is an interpreted language, and interpreted languages are so slow that if you have to ask how much performance you're giving up, you can't afford it. Consider this chart from Code Complete 2.0:

Language      Type of Language  Execution Time Relative to C++
C++           Compiled          1:1
Visual Basic  Compiled          1:1
C#            Compiled          1:1
Java          Byte code         1.5:1
PHP           Interpreted       > 100:1
Python        Interpreted       > 100:1

I realize that Web 2.0 is built on the back of the cheap "whatever box" server. Twitter is probably the perfect storm of refresh-heavy design coupled with exponential growth. Most websites wish they were so lucky.

To be fair, it sounds like most of Twitter's problems are database problems, so maybe it doesn't matter what language they use. But it does make you wonder: what's more important-- the service, or the platform you deliver that service on?

In the case where the latter is jeopardizing the former, I think it's pretty clear where your allegiances should lie. Your users don't care how cool the Rails platform is-- but they sure do care about consistent availability of your service.

Update: This entry isn't as clear as it could be. See my followup to this post for a better explanation of my position.


The Pernicious Issue of Software Patents

A reddit user recently invoked link necromancy on a 1994 Donald Knuth letter to the U.S. Patent Office:

When I think of the computer programs I require daily to get my own work done, I cannot help but realize that none of them would exist today if software patents had been prevalent in the 1960s and 1970s. Changing the rules now will have the effect of freezing progress at essentially its current level. If present trends continue, the only recourse available to the majority of America's brilliant software developers will be to give up software or to emigrate. The U.S.A. will soon lose its dominant position.

Please do what you can to reverse this alarming trend. There are far better ways to protect the intellectual property rights of software developers than to take away their right to use fundamental building blocks.

You have to respect the opinion of Donald Knuth, because he's our homeboy.

[Image: Knuth is my Homeboy]

Still, opinions vary. The software patent debate merits an entire Wikipedia article, and the ensuing comment debate on Reddit represents plenty of opposing viewpoints.

Paul Graham, surprisingly, thinks software patents don't matter:

I'm not saying secrecy would be worse than patents, just that we couldn't discard patents for free. Businesses would become more secretive to compensate, and in some fields this might get ugly. Nor am I defending the current patent system. There is clearly a lot that's broken about it. But the breakage seems to affect software less than most other fields.

In the software business I know from experience whether patents encourage or discourage innovation, and the answer is the type that people who like to argue about public policy least like to hear: they don't affect innovation much, one way or the other. Most innovation in the software business happens in startups, and startups should simply ignore other companies' patents. At least, that's what we advise, and we bet money on that advice.

Paul Heckel goes so far as to say responsible, rational use of software patents may actually encourage innovation:

In brief, what superficially looks like another problem to be dealt with in the increasingly competitive, commodities oriented software business, will prove to be what makes products less price competitive. Many industries have worked on this basis all along: patents make industries more diverse in their offerings, more profitable, more innovative, and ultimately will make the U.S. more competitive.

The essence of this article is simple: Software intellectual property issues are not inherently different in substance from other technologies; what motivates people is not inherently different; industry life cycle is not inherently different; marketing and business strategies and tactics are not inherently different; the law and policy issues are not inherently different. The technology is not even new. Software has been around for 40 years. The issues may be new to those who had no experience of them, but the only thing that is different is that software is a mass market industry for the first time and real money is at stake.

As much as I respect Knuth, I have to agree that the problem with software patents isn't the patents themselves. It's the sloppy, haphazard way the patents are granted and enforced. If anything needs reforming, it's the U.S. Patent Office.


Usability Is Timeless

Jakob Nielsen's new book, Prioritizing Web Usability, is a worthy companion to the previous two. Now it's a trilogy:

  1. Designing Web Usability: The Practice of Simplicity (2000)
  2. Homepage Usability: 50 Websites Deconstructed (2002)
  3. Prioritizing Web Usability (2006)

You can tell Jakob and his co-authors are growing ever more skilled at the practice of simplicity; this book is the first in the series to drop the colon and subtitle.

[Image: Prioritizing Web Usability book cover]

The very existence of the new, updated book hints that usability guidelines evolve over time. In one of the first chapters, Nielsen makes this explicit by revisiting earlier web usability issues to see how much they've improved in seven years. Each usability issue is rated from zero to three skulls to indicate how severe the problem is today:

1. Usability issues that are still major problems today

XXX  Links that don't change color when visited
XXX  Breaking the back button
XXX  Opening new browser windows
XXX  Pop-up windows
XXX  Design elements that look like advertisements
XXX  Violating web-wide conventions
XXX  Vaporous content and empty hype
XXX  Dense content and unscannable text

2. Usability issues that are less important due to improvements in technology

X    Slow download times
X    Frames
XX   Adobe Flash content
XX   Low-relevancy search listings
XX   Multimedia and long videos
XX   Frozen layouts
XX   Cross-platform incompatibility

3. Usability issues that are less important because users have adapted to the web

X    Uncertain clickability
XX   Scrolling
X    Registration
XX   Complex URLs
X    Pull-down and cascading menus

4. Usability issues that are less important because designers have learned restraint

X    Plug-ins and bleeding edge technology
X    3D user interfaces
X    Bloated design
X    Splash pages
X    Moving graphics and scrolling text
XX   Custom GUI widgets
X    Not disclosing who's behind information
X    Made-up words
XX   Outdated content
XX   Inconsistency within a web site
XX   Premature requests for personal information
XX   Multiple sites

When comparing the severity of these 34 usability issues with their historical severity in 2000, Nielsen notes that most of the progress can be attributed to designers learning restraint:

Resolved by user adaptation         11%
Resolved by advances in technology  10%
Resolved by designer restraint      21%
Still an issue                      58%

Relying on user education or technology fixes to address usability issues means you'll be waiting a long time. Most of the immediate benefits are realized by designers who learn to follow usability guidelines. But designers are fallible, too, so there's no guarantee these problems won't crop up again later, or in slightly different forms.

The data presented in Prioritizing Web Usability shows that usability guidelines do evolve over time, but slowly. It also illustrates how the core principles of usability are timeless:

From 1984 to 1986, the U.S. Air Force compiled existing usability knowledge into a single, well-organized set of guidelines for its user interface designers called Guidelines for Designing User Interface Software, ESD-TR-86-278 (also available as a pdf). Jakob Nielsen was one of several people who advised the undertaking. The project identified 944 guidelines, most of them related to military command and control systems built in the 1970s and early 1980s, which used mainframe technology.

You might think that these old findings would be irrelevant to today's designers. If so, you'd be wrong. As an experiment, we retested 60 of the 1986 guidelines in 2005. Of these, 54 continue to be valid today. Of the total 944 guidelines, we deduced that 10 percent are no longer valid and 20 percent are irrelevant because they relate to rarely used interface technologies. But nearly 70 percent of the original guidelines continue to be both correct and relevant 20 years later.
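The arithmetic behind those percentages is easy to check. The sketch below is one plausible reading of how the figures fit together, not the authors' stated method: it assumes the 10 percent comes from extrapolating the retested sample, and it takes the 20 percent figure as given. The variable names are mine.

```python
retested, still_valid = 60, 54
total_guidelines = 944

# 6 of the 60 retested guidelines failed, which is the 10 percent quoted above.
invalid_pct = (retested - still_valid) * 100 // retested

# Share tied to rarely used interface technologies, taken directly from the passage.
irrelevant_pct = 20

# What remains is both correct and relevant.
relevant_and_valid_pct = 100 - invalid_pct - irrelevant_pct

# Rough number of the 944 guidelines that this share represents.
approx_count = total_guidelines * relevant_and_valid_pct // 100
```

Under that reading, roughly 660 of the 944 guidelines written for 1970s mainframe systems still hold up.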

This is one of the reasons I urge software developers to study and understand the principles of usability. It's one of the precious few bodies of knowledge in a developer's toolkit that will still be useful twenty years from now.
