
Your Password is Too Damn Short

I'm a little tired of writing about passwords. But like taxes, email, and pinkeye, they're not going away any time soon. Here's what I know to be true, and backed up by plenty of empirical data:

  • No matter what you tell them, users will always choose simple passwords.

  • No matter what you tell them, users will re-use the same password over and over on multiple devices, apps, and websites. If you are lucky they might use a couple passwords instead of the same one.

What can we do about this as developers?

  • Stop requiring passwords altogether, and let people log in with Google, Facebook, Twitter, Yahoo, or any other valid form of Internet driver's license that you're comfortable supporting. The best password is one you don't have to store.

  • Urge browsers to support automatic, built-in password generation and management. Ideally the OS would support this as well, but that requires cloud storage and everyone being on the same page, so per-browser support seems most likely to me. Chrome, at least, is moving in this direction.

  • Nag users at the time of signup when they enter passwords that are …

    • Too short: UY7dFd

    • Lack sufficient entropy: aaaaaaaaa

    • Match common dictionary words: anteaters1

This is commonly done with an ambient password strength meter, which provides real time feedback as you type.
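Under the hood, those nags don't need to be fancy. Here's a minimal sketch of the three checks in TypeScript; the thresholds and the tiny word list are illustrative assumptions, not any particular site's actual rules.

```typescript
// Illustrative signup-time password checks (thresholds are assumptions).
const COMMON_WORDS = new Set(["password", "anteaters", "iloveyou"]); // stand-in for a real dictionary

function passwordProblems(pw: string): string[] {
  const problems: string[] = [];
  if (pw.length < 8) {
    problems.push("too short");
  }
  // Crude entropy proxy: how many distinct characters were used?
  if (new Set(pw).size < 5) {
    problems.push("lacks sufficient entropy");
  }
  // Strip trailing digits, then check against the dictionary.
  const base = pw.toLowerCase().replace(/\d+$/, "");
  if (COMMON_WORDS.has(base)) {
    problems.push("matches a common dictionary word");
  }
  return problems; // empty array: the password passed
}

passwordProblems("UY7dFd");     // ["too short"]
passwordProblems("aaaaaaaaa");  // ["lacks sufficient entropy"]
passwordProblems("anteaters1"); // ["matches a common dictionary word"]
```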

If you can't avoid storing the password – the first two items I listed above are both about avoiding the need for the user to select a 'new' password altogether – then showing an estimation of password strength as the user types is about as good as it gets.

The easiest way to build a safe password is to make it long. All other things being equal, the law of exponential growth means a longer password is a better password. That's why I was always a fan of passphrases, though they are exceptionally painful to enter via touchscreen in our brave new world of mobile – and that is an increasingly critical flaw. But how short is too short?

When we built Discourse, I had to select an absolute minimum password length that we would accept. I chose a default of 8, based on what I knew from my speed hashing research. An eight character password isn't great, but as long as you use a reasonable variety of characters, it should be sufficiently resistant to attack.

By attack, I don't mean an attacker automating a web page or app to repeatedly enter passwords. There is some of this, for extremely common passwords, but that's unlikely to be a practical attack on many sites or apps, as they tend to have rate limits on how often and how rapidly you can try different passwords.

What I mean by attack is a high speed offline attack on the hash of your password, where an attacker gains access to a database of leaked user data. This kind of leak happens all the time. And it will continue to happen forever.

If you're really unlucky, the developers behind that app, service, or website stored the password in plain text. This thankfully doesn't happen too often any more, thanks to education efforts. Progress! But even if the developers did properly store a hash of your password instead of the actual password, you better pray they used a really slow, complex, memory hungry hash algorithm, like bcrypt. And that they selected a high number of iterations. Oops, sorry, that was written in the dark ages of 2010 and is now out of date. I meant to say scrypt. Yeah, scrypt, that's the ticket.

Then we're safe? Right? Let's see.

You might read this and think that a massive cracking array is something that's hard to achieve. I regret to inform you that building an array of, say, 24 consumer grade GPUs that are optimized for speed hashing, is well within the reach of the average law enforcement agency and pretty much any small business that can afford a $40k equipment charge. No need to buy when you can rent – plenty of GPU equipped cloud servers these days. Beyond that, imagine what a motivated nation-state could bring to bear. The mind boggles.

Even if you don't believe me (but you should), the offline fast attack scenario, which is much easier to achieve, was hardly any better at 37 minutes.

Perhaps you're a skeptic. That's great, me too. What happens when we try a longer random.org password on the massive cracking array?

  • 9 characters: 2 minutes
  • 10 characters: 2 hours
  • 11 characters: 6 days
  • 12 characters: 1 year
  • 13 characters: 64 years

The random.org generator is "only" uppercase, lowercase, and numbers. What if we add special characters, to keep Q*Bert happy?

  • 8 characters: 1 minute
  • 9 characters: 2 hours
  • 10 characters: 1 week
  • 11 characters: 2 years
  • 12 characters: 2 centuries

That's a bit better, but you can't really feel safe until the 12 character mark even with a full complement of uppercase, lowercase, numbers, and special characters.
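If you want to sanity-check these tables yourself, the arithmetic is nothing more than keyspace divided by guessing speed. Here's a back-of-the-envelope sketch; the hash rate of roughly 10^14 per second is my own inference from the numbers above, so treat it strictly as an assumption.

```typescript
// Back-of-the-envelope crack time: charsetSize^length / guesses per second.
// ~1e14 hashes/sec is an assumed figure for a massive GPU cracking array
// against a fast hash; it roughly reproduces the tables above.
const HASHES_PER_SECOND = 1e14;

function secondsToCrack(charsetSize: number, length: number): number {
  return Math.pow(charsetSize, length) / HASHES_PER_SECOND;
}

// 62 = uppercase + lowercase + digits, as in the random.org passwords.
for (const len of [9, 10, 11, 12, 13]) {
  console.log(`${len} chars: ~${secondsToCrack(62, len).toExponential(1)} seconds`);
}
// 9 chars  -> ~1.4e2 seconds (about 2 minutes)
// 12 chars -> ~3.2e7 seconds (about a year)
// Add special characters and the charset grows to ~94, with the effect shown above.
```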

It's unlikely that massive cracking scenarios will get any slower. While there is definitely a password length where all cracking attempts fall off an exponential cliff that is effectively insurmountable, these numbers will only get worse over time, not better.

So after all that, here's what I came to tell you, the poor, beleaguered user:

Unless your password is at least 12 characters, you are vulnerable.

That should be the minimum password size you use on any service. Generate your password with some kind of offline generator, with diceware, or your own home-grown method of adding words and numbers and characters together – whatever it takes, but make sure your passwords are all at least 12 characters.

Now, to be fair, as I alluded to earlier all of this does depend heavily on the hashing algorithm that was selected. But you have to assume that every password you use will be hashed with the lamest, fastest hash out there. One that is easy for GPUs to calculate. There's a lot of old software and systems out there, and will be for a long, long time.

And for developers:

  1. Pick your new password hash algorithms carefully, and move all your old password hashing systems to much harder to calculate hashes. You need hashes that are specifically designed to be hard to calculate on GPUs, like scrypt.

  2. Even if you pick the "right" hash, you may be vulnerable if your work factor isn't high enough. Matasano recommends the following:

  • scrypt: N=2^14, r=8, p=1

  • bcrypt: cost=11

  • PBKDF2 with SHA256: iterations=86,000

But those are just guidelines; you have to scale the hashing work to what's available and reasonable on your servers or devices. For example, we had a minor denial of service bug in Discourse where we allowed people to enter up to 20,000 character passwords in the login form, and calculating the hash on that took, uh … several seconds.
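For concreteness, here's roughly what those parameters look like in practice, as a minimal sketch using Node's built-in crypto module. This is not Discourse's actual implementation; the 100 character cap is an illustrative guard against the password-length denial of service I just described.

```typescript
// Sketch: scrypt password hashing at the recommended work factor
// (N=2^14, r=8, p=1), using Node's built-in crypto module.
import { randomBytes, scryptSync, timingSafeEqual } from "crypto";

const MAX_PASSWORD_LENGTH = 100; // generous, but bounded: no 20,000 character hashes

function hashPassword(password: string): { salt: string; hash: string } {
  if (password.length > MAX_PASSWORD_LENGTH) throw new Error("password too long");
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 64, { N: 2 ** 14, r: 8, p: 1 });
  return { salt: salt.toString("hex"), hash: hash.toString("hex") };
}

function verifyPassword(password: string, salt: string, expected: string): boolean {
  if (password.length > MAX_PASSWORD_LENGTH) return false;
  const candidate = scryptSync(password, Buffer.from(salt, "hex"), 64, {
    N: 2 ** 14,
    r: 8,
    p: 1,
  });
  return timingSafeEqual(candidate, Buffer.from(expected, "hex"));
}
```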

Now if you'll excuse me, I need to go change my PayPal password.


Given Enough Money, All Bugs Are Shallow

Eric Raymond, in The Cathedral and the Bazaar, famously wrote

Given enough eyeballs, all bugs are shallow.

The idea is that open source software, by virtue of allowing anyone and everyone to view the source code, is inherently less buggy than closed source software. He dubbed this "Linus's Law".

Insofar as it goes, I believe this is true. When only the 10 programmers who happen to work at your company today can look at your codebase, it's unlikely to be as well reviewed as a codebase that's public to the world's scrutiny on GitHub.

However, the Heartbleed SSL vulnerability was a turning point for Linus's Law, a catastrophic exploit based on a severe bug in open source software. How catastrophic? It affected about 18% of all the HTTPS websites in the world, and allowed attackers to view all traffic to these websites, unencrypted... for two years.

All those websites you thought were secure? Nope. This bug went unnoticed for two full years.

Two years!

OpenSSL, the library with this bug, is one of the most critical bits of Internet infrastructure the world has – relied on by major companies to encrypt the private information of their customers as it travels across the Internet. OpenSSL was used on millions of servers and devices to protect the kind of important stuff you want encrypted, and hidden away from prying eyes, like passwords, bank accounts, and credit card information.

This should be some of the most well-reviewed code in the world. What happened to our eyeballs, man?

In reality, it's generally very, very difficult to fix real bugs in anything but the most trivial Open Source software. I know that I have rarely done it, and I am an experienced developer. Most of the time, what really happens is that you tell the actual programmer about the problem and wait and see if he/she fixes it – Neil Gunton

Even if a brave hacker chooses to read the code, they're not terribly likely to spot one of the hard-to-spot problems. Why? Few open source hackers are security experts. – Jeremy Zawodny

The fact that many eyeballs are looking at a piece of software is not likely to make it more secure. It is likely, however, to make people believe that it is secure. The result is an open source community that is probably far too trusting when it comes to security. – John Viega

I think there are a couple problems with Linus's Law:

  1. There's a big difference between usage eyeballs and development eyeballs. Just because you pull down some binaries in an RPM, or compile something in Linux, or even report bugs back to the developers via their bug tracker, doesn't mean you're doing anything at all to contribute to the review of the underlying code. Most eyeballs are looking at the outside of the code, not the inside. And while you can discover bugs, even important security bugs, through usage, the hairiest security bugs require inside knowledge of how the code works.

  2. The act of writing (or cut-and-pasting) your own code is easier than understanding and peer reviewing someone else's code. There is a fundamental, unavoidable asymmetry of work here. The amount of code being churned out today – even if you assume only a small fraction of it is "important" enough to require serious review – far outstrips the number of eyeballs available to look at the code. (Yes, this is another argument in favor of writing less code.)

  3. There are not enough qualified eyeballs to look at the code. Sure, the overall number of programmers is slowly growing, but what percent of those programmers are skilled enough, and have the right security background, to be able to audit someone else's code effectively? A tiny fraction.

Even if the code is 100% open source, utterly mission critical, and used by major companies in virtually every public facing webserver for customer security purposes, we end up with critical bugs that compromise everyone. For two years!

That's the lesson. If we can't naturally get enough eyeballs on OpenSSL, how does any other code stand a chance? What do we do? How do we get more eyeballs?

The short term answer was:

These are both very good things and necessary outcomes. We should be doing this for all the critical parts of the open source ecosystem people rely on.

But what's the long term answer to the general problem of not enough eyeballs on open source code? It's something that will sound very familiar to you, though I suspect Eric Raymond won't be too happy about it.

Money. Lots and lots of money.

Increasingly, companies are turning to commercial bug bounty programs. Either ones they create themselves, or run through third party services like Bugcrowd, Synack, HackerOne, and Crowdcurity. This means you pay per bug, with a larger payout the bigger and badder the bug is.

Or you can attend an annual event like Pwn2Own, a contest with massive prizes, as large as hundreds of thousands of dollars, for exploiting common software. Staging a big annual event means a lot of publicity and interest, attracting the biggest guns.

That's the message. If you want to find bugs in your code, in your website, in your app, you do it the old fashioned way: by paying for them. You buy the eyeballs.

While I applaud any effort to make things more secure, and I completely agree that security is a battle we should be fighting on multiple fronts, both commercial and non-commercial, I am uneasy about some aspects of paying for bugs becoming the new normal. What are we incentivizing, exactly?

Money makes security bugs go underground

There's now a price associated with exploits, and the deeper the exploit and the lesser known it is, the more incentive there is to not tell anyone about it until you can collect a major payout. So you might wait up to a year to report anything, and meanwhile this security bug is out there in the wild – who knows who else might have discovered it by then?

If your focus is the payout, who is paying more? The good guys, or the bad guys? Should you hold out longer for a bigger payday, or build the exploit up into something even larger? I hope for our sake the good guys have the deeper pockets, otherwise we are all screwed.

I like that Google addressed a few of these concerns by making Pwnium, their Chrome specific variant of Pwn2Own, a) no longer a yearly event but all day, every day and b) increasing the prize money to "infinite". I don't know if that's enough, but it's certainly going in the right direction.

Money turns security into a "me" goal instead of an "us" goal

I first noticed this trend when one or two people reported minor security bugs in Discourse, and then seemed to hold out their hand, expectantly. (At least, as much as you can do something like that in email.) It felt really odd, and it made me uncomfortable.

Am I now obligated, on top of providing a completely free open source project to the world, to pay people for contributing information about security bugs that make this open source project better? Believe me, I was very appreciative of the security bug reporting, and I sent them whatever I could, stickers, t-shirts, effusive thank you emails, callouts in the code and checkins. But open source isn't supposed to be about the money… is it?

Perhaps the landscape is different for closed-source, commercial products, where there's no expectation of quid pro quo, and everybody already pays for the service directly or indirectly anyway.

No Money? No Security.

If all the best security researchers are working on ever larger bug bounties, and every major company adopts these sorts of bug bounty programs, what does that do to the software industry?

It implies that unless you have a big budget, you can't expect to have great security, because nobody will want to report security bugs to you. Why would they? They won't get a payday. They'll be looking elsewhere.

A ransomware culture of "pay me or I won't tell you about your terrible security bug" does not feel very far off, either. We've had emails like that already.

Easy money attracts all skill levels

One unfortunate side effect of this bug bounty trend is that it attracts not just bona fide programmers interested in security, but anyone interested in easy money.

We've gotten too many "serious" security bug reports that were extremely low value. And we have to follow up on these, because they are "serious", right? Unfortunately, many of them are a waste of time, because …

  • The submitter is more interested in scaring you about the massive, critical security implications of this bug than actually providing a decent explanation of the bug, so you'll end up doing all the work.

  • The submitter doesn't understand what is and isn't an exploit, but knows there is value in anything resembling an exploit, so submits everything they can find.

  • The submitter can't share notes with other security researchers to verify that the bug is indeed an exploit, because they might "steal" their exploit and get paid for it before they do.

  • The submitter needs to convince you that this is an exploit in order to get paid, so they will argue with you about this. At length.

The incentives feel really wrong to me. As much as I know security is incredibly important, I view these interactions with an increasing sense of dread because they generate work for me and the returns are low.

What can we do?

Fortunately, we all have the same goal: make software more secure.

So we should view bug bounty programs as an additional angle of attack, another aspect of "defense in depth", perhaps optimized a bit more for commercial projects where there is ample money. And that's OK.

But I have some advice for bug bounty programs, too:

  • You should have someone vetting these bug reports, and making sure they are credible, have clear reproduction steps, and are repeatable, before we ever see them.

  • You should build additional incentives in your community for some kind of collaborative work towards bigger, better exploits. These researchers need to be working together in public, not in secret against each other.

  • You should have a reputation system, so that only the better, proven contributors make it through and submit reports.

  • Encourage larger orgs to fund bug bounties for common open source projects, not just their own closed source apps and websites. At Stack Exchange, we donated to open source projects we used every year. Donating a bug bounty could be a big bump in eyeballs on that code.

I am concerned that we may be slowly moving toward a world where given enough money, all bugs are shallow. Money does introduce some perverse incentives for software security, and those incentives should be watched closely.

But I still believe that the people who will freely report security bugs in open source software because

  • It is the right thing to do™

and

  • They want to contribute back to open source projects that have helped them, and the world

… will hopefully not be going away any time soon.


Toward a Better Markdown Tutorial

It's always surprised me when people, especially technical people, say they don't know Markdown. Do you not use GitHub? Stack Overflow? Reddit?

I get that an average person may not understand how Markdown is based on simple old-school plaintext ASCII typing conventions. Like when you're *really* excited about something, you naturally put asterisks around it, and Markdown makes that automagically italic.

But how can we expect them to know that, if they grew up with wizzy-wig editors where the only way to make italic is to click a toolbar button, like an animal?

I am not advocating for WYSIWYG here. While there's certainly more than one way to make italic, I personally don't like invisible formatting tags and I find that WYSIWYG is more like WYCSYCG in practice. It's dangerous to be dependent on these invisible formatting codes you can't control. And they're especially bad if you ever plan to care about differences, revisions, and edit history. That's why I like to teach people simple, visible formatting codes.

We can certainly debate which markup language is superior, but in Discourse we tried to build a rainbow tool that satisfies everyone. We support:

  • HTML (safe subset)
  • BBCode (basic subset)
  • Markdown (full)

This makes coding our editor kind of hellishly complex, but it means that for you, the user, whatever markup language you're used to will probably "just work" on any Discourse site you happen to encounter in the future. But BBCode and HTML are supported mostly as bridges. What we view as our primary markup format, and what we want people to learn to use, is Markdown.

However, one thing I have really struggled with is that there isn't any single great place to refer people to with a simple walkthrough and explanation of Markdown.

When we built Stack Overflow circa 2008-2009, I put together my best effort at the time which became the "editing help" page:

It's just OK. And GitHub has their Markdown Basics, and GitHub Flavored Markdown help pages. They're OK.

The Ghost editor I am typing this in has an OK Markdown help page too.

But none of these are great.

What we really need is a great Markdown tutorial and reference page, one that we can refer anyone to, anywhere in the world, from someone who barely touches computers to the hardest of hard-core coders. I don't want to build another one of these help pages just for Discourse, I want to build one for everyone. Since it is for everyone, I want to involve everyone. And by everyone, I mean you.

After writing about Our Programs Are Fun To Use – which I just updated with a bunch of great examples contributed in the comments, so go check that out even if you read it already – I am inspired by the idea that we can make a fun, interactive Markdown tutorial together.

So here's what I propose: a small contest to build an interactive Markdown tutorial and reference, which we will eventually host at the home page of commonmark.org, and can be freely mirrored anywhere in the world.

Some ground rules:

  • It should be primarily in JavaScript and HTML. Ideally entirely so. If you need to use a server-side scripting language, that's fine, but try to keep it simple, and make sure it's something that is reasonable to deploy on a generic Linux server anywhere. (For a taste of the interactive core, see the live-preview sketch just after these rules.)

  • You can pick any approach you want, but it should be highly interactive, and I suggest that you at minimum provide two tracks:

    • A gentle, interactive tutorial for absolute beginners who are asking "what the heck does Markdown even mean?"

    • A dynamic, interactive reference for intermediates and experts who are asking more advanced usage questions, like "how do I make code inside a list, or a list inside a list?"

  • There's a lot of variance in Markdown implementations, so teach the most common parts of Markdown, and cover the optional / less common variations either in the advanced reference areas or in extra bonus sections. People do love their tables and footnotes! We recommend using a CommonMark compatible implementation, but it is not a requirement.

  • Your code must be MIT licensed.

  • Judging will be completely at the whim of myself and John MacFarlane. Our decisions will be capricious, arbitrary, probably nonsensical, and above all, final.

  • We'll run this contest for a period of one month, from today until April 28th, 2015.

  • If I have hastily left out any clarifying rules I should have had, they will go here.
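To make "highly interactive" concrete: the core of a live Markdown preview is genuinely tiny. Here's a minimal sketch in TypeScript using the commonmark.js reference implementation; the element IDs are assumptions about the surrounding page, not part of any contest requirement.

```typescript
// Minimal live Markdown preview: re-render on every keystroke.
// Uses the commonmark.js reference implementation.
import { Parser, HtmlRenderer } from "commonmark";

const parser = new Parser();
const renderer = new HtmlRenderer({ safe: true }); // don't pass raw user HTML through

// Hypothetical element IDs; wire up to whatever your page actually uses.
const input = document.getElementById("markdown-input") as HTMLTextAreaElement;
const preview = document.getElementById("markdown-preview") as HTMLElement;

input.addEventListener("input", () => {
  preview.innerHTML = renderer.render(parser.parse(input.value));
});
```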

Of course, the real reward for building is the admiration of your peers, and the knowledge that an entire generation of people will grow up learning basic Markdown skills through your contribution to a global open source project.

But on top of that, I am offering … fabulous prizes!

  1. Let's start with my Recommended Reading List. I count sixteen books on it. As long as you live in a place Amazon can ship to, I'll send you all the books on that list. (Or the equivalent value in an Amazon gift certificate, if you happen to have a lot of these books already, or prefer that.)

  2. Second prize is a CODE Keyboard. This can be shipped worldwide.

  3. Third prize is you're fired. Just kidding. Third prize is your choice of any three books on my reading list. (Same caveats around Amazon apply.)

Looking for a place to get started? Check out:

If you want privacy, you can mail your entries to me directly (see the about page here for my email address), or if you are comfortable with posting your contest entry in public, I'll create a topic on talk.commonmark for you to post links and gather feedback. Leaving your entry in the comments on this article is also OK.

We desperately need a great place that we can send everyone to learn Markdown, and we need your help to build it. Let's give this a shot. Surprise and amaze us!

Update: we selected winners in June 2015 and the final result is now permanently located at commonmark.org/help – enjoy!


Our Programs Are Fun To Use

These two imaginary guys influenced me heavily as a programmer.

Instead of guaranteeing fancy features or compatibility or error free operation, Beagle Bros software promised something else altogether: fun.


Playing with the Beagle Bros quirky Apple II floppies in middle school and high school, and the smorgasbord of oddball hobbyist ephemera collected on them, was a rite of passage for me.

Here were a bunch of goofballs writing terrible AppleSoft BASIC code like me, but doing it for a living – and clearly having fun in the process. Apparently, the best way to create fun programs for users is to make sure you had fun writing them in the first place.

But more than that, they taught me how much more fun it was to learn by playing with an interactive, dynamic program instead of passively reading about concepts in a book.

That experience is another reason I've always resisted calls to add "intro videos", external documentation, walkthroughs and so forth.

One of the programs on these Beagle Bros floppies, and I can't for the life of me remember which one, or in what context this happened, printed the following on the screen:

One day, all books will be interactive and animated.

I thought, wow. That's it. That's what these floppies were trying to be! Interactive, animated textbooks that taught you about programming and the Apple II! Incredible.

This idea has been burned into my brain for twenty years, ever since I originally read it on that monochrome Apple //c screen. Imagine a world where textbooks didn't just present a wall of text to you, the learner, but actually engaged you, played with you, and invited experimentation. Right there on the page.

(Also, if you can find and screenshot the specific Beagle Bros program that I'm thinking of here, I'd be very grateful: there's a free CODE Keyboard with your name on it.)

Between the maturity of JavaScript, HTML 5, and the latest web browsers, you can deliver exactly the kind of interactive, animated textbook experience the Beagle Bros dreamed about in 1985 to billions of people with nothing more than access to the Internet and a modern web browser.

Here are a few great examples I've collected. Screenshots don't tell the full story, so click through and experiment.

As suggested in the comments, and also excellent:

(There are also native apps that do similar things; the well reviewed Earth Primer, for example. But when it comes to education, I'm not too keen on platform specific apps which seem replicable in common JavaScript and HTML.)

In the bad old days, we learned programming by reading books. But instead of reading this dry old text:

Now we can learn the same concepts interactively, by reading a bit, then experimenting with live code on the same page as the book, and watching the results as we type.

C'mon. Type something. See what happens.

I certainly want my three children to learn from other kids and their teachers, as humans have since time began. But I also want them to have access to a better class of books than I did. Books that are effectively programs. Interactive, animated books that let them play and experiment and create, not just passively read.

I want them to learn, as I did, that our programs are fun to use.


The God Login

I graduated with a Computer Science minor from the University of Virginia in 1992. The reason it's a minor and not a major is because to major in CS at UVa you had to go through the Engineering School, and I was absolutely not cut out for that kind of hardcore math and physics, to put it mildly. The beauty of a minor was that I could cherry pick all the cool CS classes and skip everything else.

One of my favorite classes, the one I remember the most, was Algorithms. I always told people my Algorithms class was the one part of my college education that influenced me most as a programmer. I wasn't sure exactly why, but a few years ago I had a hunch so I looked up a certain CV and realized that Randy Pausch – yes, the Last Lecture Randy Pausch – taught that class. The timing is perfect: University of Virginia, Fall 1991, CS461 Analysis of Algorithms, 50 students.

I was one of them.

No wonder I was so impressed. Pausch was an incredible, charismatic teacher, a testament to the old adage that you should choose your teacher first and the class material second, if you bother to at all. It's so true.

In this case, the combination of great teacher and great topic was extra potent, as algorithms are central to what programmers do. Not that we invent new algorithms, but we need to understand the code that's out there, grok why it tends to be fast or slow due to the tradeoffs chosen, and choose the correct algorithms for what we're doing. That's essential.

And one of the coolest things Mr. Pausch ever taught me was to ask this question:

What's the God algorithm for this?

Well, when sorting a list, obviously God wouldn't bother with a stupid Bubble Sort or Quick Sort or Shell Sort like us mere mortals, God would just immediately place the items in the correct order. Bam. One step. The ultimate lower bound on computation, O(1). Not just fixed time, either, but literally one instantaneous step, because you're freakin' God.

This kind of blew my mind at the time.

I always suspected that programmers became programmers because they got to play God with the little universe boxes on their desks. Randy Pausch took that conceit and turned it into a really useful way of setting boundaries and asking yourself hard questions about what you're doing and why.

So when we set out to build a login dialog for Discourse, I went back to what I learned in my Algorithms class and asked myself:

How would God build this login dialog?

And the answer is, of course, God wouldn't bother to build a login dialog at all. Every user would already be logged into GodApp the second they loaded the page because God knows who they are. Authoritatively, even.

This is obviously impossible for us, because God isn't one of our investors.

But… how close can we get to the perfect godlike login experience in Discourse? That's a noble and worthy goal.

Wasn't it Bill Gates who once asked why the hell every programmer was writing the same File Open dialogs over and over? It sure feels that way for login dialogs. I've been saying for a long time that the best login is no login at all and I'm a staunch supporter of logging in with your Internet Driver's license whenever possible. So we absolutely support that, if you've configured it.

But today I want to focus on the core, basic login experience: user and password. That's the default until you configure up the other methods of login.

A login form with two fields, two buttons, and a link on it seems simple, right? Bog standard. It is, until you consider all the ways the simple act of logging in with those two fields can go wrong for the user. Let's think.

Let the user enter an email to log in

The critical fault of OpenID, as much as I liked it as an early login solution, was its assumption that users could accept a URL as their "identity". This is flat out crazy, and in the long run this central flawed assumption is what broke OpenID as a future standard.

User identity is always email, plain and simple. What happens when you forget your password? You get an email, right? Thus, email is your identity. Some people even propose using email as the only login method.

It's fine to have a username, of course, but always let users log in with either their username or their email address. Because I can tell you with 100% certainty that when those users forget their password, and they will, all the time, they'll need that email anyway to get a password reset. Email and password are strongly related concepts and they belong together. Always!

(And a fie upon services that don't allow me to use my email as a username or login. I'm looking at you, Comixology.)
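In code, "either one works" is a small bit of logic at the lookup layer. A minimal sketch follows; findByEmail and findByUsername are hypothetical data-access helpers, not Discourse's actual API.

```typescript
// Sketch: accept a username or an email address in the same login field.
interface User { id: number; username: string; email: string; }

// Hypothetical data-access helpers.
declare function findByEmail(email: string): Promise<User | null>;
declare function findByUsername(username: string): Promise<User | null>;

async function findUserForLogin(identifier: string): Promise<User | null> {
  const trimmed = identifier.trim();
  // Anything containing "@" is treated as an email address first,
  // falling back to a username lookup just in case.
  if (trimmed.includes("@")) {
    return (await findByEmail(trimmed.toLowerCase())) ?? findByUsername(trimmed);
  }
  return findByUsername(trimmed);
}
```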

Tell the user when their email doesn't exist

OK, so we know that email is de-facto identity for most people, and this is a logical and necessary state of affairs. But which of my 10 email addresses did I use to log into your site?

This was the source of a long discussion at Discourse about whether it made sense to reveal to the user, when they enter an email address in the "forgot password" form, whether we have that email address on file. On many websites, here's the sort of message you'll see after entering an email address in the forgot password form:

If an account matches name@example.com, you should receive an email with instructions on how to reset your password shortly.

Note the coy "if" there, which is a hedge against all the security implications of revealing whether a given email address exists on the site just by typing it into the forgot password form.

We're deadly serious about picking safe defaults for Discourse, so out of the box you won't get exploited or abused or overrun with spammers. But after experiencing the real world "which email did we use here again?" login state on dozens of Discourse instances ourselves, we realized that, in this specific case, being user friendly is way more important than being secure.

The new default is to let people know when they've entered an email we don't recognize in the forgot password form. This will save their sanity, and yours. You can turn on the extra security of being coy about this, if you need it, via a site setting.
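Sketched out, the friendly default and the coy opt-in look something like this. The setting name and messages are illustrative, not Discourse's actual code.

```typescript
// Sketch: forgot-password response, with a site setting for extra secrecy.
declare function emailExists(email: string): Promise<boolean>; // hypothetical lookup

const hideEmailAddressTaken = false; // default: be user friendly

async function forgotPasswordMessage(email: string): Promise<string> {
  if (hideEmailAddressTaken) {
    // Coy mode: never reveal whether the address is on file.
    return `If an account matches ${email}, you should receive an email with instructions shortly.`;
  }
  return (await emailExists(email))
    ? `We sent password reset instructions to ${email}.`
    : `Sorry, no account matches ${email}. Maybe you signed up with a different address?`;
}
```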

Let the user switch between Log In and Sign Up any time

Many websites have started to show login and signup buttons side by side. This perplexed me; aren't the acts of logging in and signing up very different things?

Well, from the user's perspective, they don't appear to be. This Verge login dialog illustrates just how close the sign up and log in forms really are. Check out this animated GIF of it in action.

We've acknowledged that similarity by having either form accessible at any time from the two buttons at the bottom of the form, as a toggle:

And both can be kicked off directly from any page via the Sign Up and Log In buttons at the top right:

Pick common words

That's the problem with language, we have so many words for these concepts:

  • Sign In
  • Log In
  • Sign Up
  • Register
  • Join <site>
  • Create Account
  • Get Started
  • Subscribe

Which are the "right" ones? User research data isn't conclusive.

I tend to favor the shorter versions when possible, mostly because I'm a fan of the whole brevity thing, but there are valid cases to be made for each depending on the circumstances and user preferences.

Sign In may be slightly more common, though Log In has some nautical and historical computing basis that makes it worthy:

A couple of years ago I did a survey of top websites in the US and UK and whether they used “sign in”, “log in”, “login”, “log on”, or some other variant. The answer at the time seemed to be that if you combined “log in” and “login”, it exceeded “sign in”, but not by much. I’ve also noticed that the trend toward “sign in” is increasing, especially with the most popular services. Facebook seems to be a “log in” hold-out.

Work with browser password managers

Every login dialog you create should be tested to work with the default password managers in …

At an absolute minimum. Upon subsequent logins in that browser, you should see the username and password autofilled.

Users rely on these default password managers built into the browsers they use, and any proper modern login form should respect that, and be designed sensibly, e.g. the password field should have type="password" in the HTML and a name that's readily identifiable as a password entry field.

There's also LastPass and so forth, but I generally assume if the login dialog works with the built in browser password managers, it will work with third party utilities, too.
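Concretely, password manager friendliness mostly comes down to markup. Here's a minimal sketch of the relevant attributes, built via the DOM for brevity; the autocomplete tokens are standard HTML, while the names and structure are assumptions about your particular form.

```typescript
// Sketch: a login form that browser password managers can recognize.
const email = document.createElement("input");
email.type = "email";
email.name = "email";            // a name readily identifiable as the login field
email.autocomplete = "username"; // standard token for the account identifier

const password = document.createElement("input");
password.type = "password";      // required for managers to offer save/autofill
password.name = "password";
password.autocomplete = "current-password";

const form = document.createElement("form");
form.append(email, password);
document.body.append(form);
```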

Handle common user mistakes

Oops, the user is typing their password with caps lock on? You should let them know about that.

Oops, the user entered their email as name@gmal.com instead of name@gmail.com? Or name@hotmail.cm instead of name@hotmail.com? You should either fix typos in common email domains for them, or let them know about that.

(I'm also a big fan of native browser "reveal password" support for the password field, so the user can verify that she typed in or autofilled the password she expects. Only Internet Explorer and I think Safari offer this, but all browsers should.)
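Both of those checks are cheap to build. Here's a minimal sketch: the Caps Lock warning uses the standard KeyboardEvent.getModifierState API, and the domain table is a tiny stand-in for a real mailcheck-style correction list.

```typescript
// Sketch: warn about Caps Lock, and suggest fixes for common email typos.
const passwordField = document.getElementById("password") as HTMLInputElement;

passwordField.addEventListener("keydown", (e: KeyboardEvent) => {
  if (e.getModifierState("CapsLock")) {
    console.warn("Caps Lock appears to be on"); // surface this next to the field
  }
});

// Tiny stand-in for a real table of common email domain typos.
const DOMAIN_FIXES: Record<string, string> = {
  "gmal.com": "gmail.com",
  "hotmail.cm": "hotmail.com",
};

function suggestEmailFix(email: string): string | null {
  const [local, domain] = email.split("@");
  const fixed = domain ? DOMAIN_FIXES[domain.toLowerCase()] : undefined;
  return fixed ? `${local}@${fixed}` : null;
}

suggestEmailFix("name@gmal.com"); // "name@gmail.com"
```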

Help users choose better passwords

There are many schools of thought on forcing, er, helping users choose passwords that aren't unspeakably awful, e.g. password123 and iloveyou and so on.

There's the common password strength meter, which updates in real time as you type in the password field.

It's a clever idea, but it gets awfully preachy for my tastes on some sites. The implementation also leaves a lot to be desired, as it's left up to the whims of the site owner to decide what password strength means. One site's "good" is another site's "get outta here with that Fisher-Price toy password". It's frustrating.

So, with Discourse, rather than all that, I decided we'd default on a solid absolute minimum password length of 8 characters, and then verify the password to make sure it is not one of the 10,000 most common known passwords by checking its hash.
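In outline, that last check is simple set membership. Here's a sketch with a tiny stand-in list; checking hashes against the real top-10,000 file works the same way in principle.

```typescript
// Sketch: reject passwords that are too short or too common.
// Tiny stand-in for the real 10,000 most common passwords list.
const COMMON_PASSWORDS = new Set(["password123", "iloveyou", "qwerty123"]);

function isAcceptablePassword(password: string): boolean {
  return password.length >= 8 && !COMMON_PASSWORDS.has(password.toLowerCase());
}

isAcceptablePassword("iloveyou");    // false: in the common list
isAcceptablePassword("password123"); // false: in the common list
isAcceptablePassword("tr0ub4dor&3"); // true, at least by these two checks
```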

Don't forget the keyboard

I feel like keyboard users are a dying breed at this point, but for those of us who, when presented with a login dialog, like to rapidly type

name@example.com, tab, p4$$w0rd, enter

please verify that this works as it should. Tab order, enter to submit, etcetera.

Rate limit all the things

You should be rate limiting everything users can do, everywhere, and that's especially true of the login dialog.

If someone forgets their password and makes 3 attempts to log in, or issues 3 forgot password requests, that's probably OK. But if someone makes a thousand attempts to log in, or issues a thousand forgot password requests, that's a little weird. Why, I might even venture to guess they're possibly … not human.

You can do fancy stuff like temporarily disable accounts or start showing a CAPTCHA if there are too many failed login attempts, but this can easily become a griefing vector, so be careful.

I think a nice middle ground is to insert standard pauses of moderately increasing size after repeated sequential failures or repeated sequential forgot password requests from the same IP address. So that's what we do.
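Here's a minimal sketch of that middle ground: pauses of increasing size, keyed by IP address. The schedule and cap are illustrative numbers, not Discourse's actual settings.

```typescript
// Sketch: escalating delays after repeated failures from the same IP.
const failureCounts = new Map<string, number>();

function recordFailure(ip: string): void {
  failureCounts.set(ip, (failureCounts.get(ip) ?? 0) + 1);
  // A real implementation would expire these entries after a quiet period.
}

function delayBeforeNextAttempt(ip: string): number {
  const count = failureCounts.get(ip) ?? 0;
  // First few failures are free; then 2s, 4s, 8s... capped at one minute.
  return count < 3 ? 0 : Math.min(2 ** (count - 2) * 1000, 60_000);
}

recordFailure("10.0.0.1");
recordFailure("10.0.0.1");
recordFailure("10.0.0.1");
delayBeforeNextAttempt("10.0.0.1"); // 2000 ms
```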

Stuff I forgot

I tried to remember everything we went through when we were building our ideal login dialog for Discourse, but I'm sure I forgot something, or could have been more thorough. Remember, Discourse is 100% open source and by definition a work in progress – so as my friend Miguel de Icaza likes to say, when it breaks, you get to keep both halves. Feel free to test out our implementation and give us your feedback in the comments, or point to other examples of great login experiences, or cite other helpful advice.

Logging in involves a simple form with two fields, a link, and two buttons. And yet, after reading all this, I'm sure you'll agree that it's deceptively complex. Your best course of action is not to build a login dialog at all, but instead rely on authentication from an outside source whenever you can.

Like, say, God.
