Coding Horror

programming and human factors

It's a Malformed World

Bill de hÓra recently highlighted a little experiment Ian Hickson ran in August:

I did a short study recently checking only for syntax errors in HTML documents, and the results were that of the 667416 files tested, 626575 had syntax errors. Over 93%. That's only syntax errors in the HTML, not checking the CSS, the content types, the semantic errors (e.g. duplicate IDs -- 86461 of those files had duplicated IDs), or any other errors.

[Figure: pie chart of HTML validation results]

If you included those kinds of errors, you'd probably find that almost all pages had errors that would trigger this warning. Thus any sort of visible UI would be basically always saying "this page is broken". That would not be good UI for the majority of users, who don't care.
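The duplicate-ID check Hickson mentions is easy to approximate yourself. Here's a minimal sketch in Python, assuming the standard library's `html.parser` is a good enough stand-in for a real validator (it isn't one, and this says nothing about the tooling Hickson actually used). The names `IdCollector` and `duplicate_ids` are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts every id attribute value seen on start tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the sorted id values that appear more than once in the markup."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(i for i, count in collector.ids.items() if count > 1)


# Deliberately sloppy markup: unclosed <p> tags and a repeated id.
sample = '<div id="nav"><p id="intro">Hi<p id="nav">Again</div>'
print(duplicate_ids(sample))  # ['nav']
```

Note that the parser chews through the unclosed `<p>` tags without complaint, which is rather the point of this whole post.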

Even Tim Berners-Lee, the godfather of the Web, acknowledges that the move to enforce well-formedness on the web with XHTML has failed:

Some things are clearer with hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once didn't work. The large HTML-generating public did not move, largely because the browsers didn't complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all. It is important to maintain HTML incrementally, as well as continuing a transition to well-formed world, and developing more power in that world.

Perhaps this is why there are 63 HTML validation errors on Google's homepage right now. Like it or not, we live in a world of malformed HTML. Browsers aren't compilers. They don't fail spectacularly when they encounter invalid markup. Nor should they. HTML is, and always has been, tolerant by design. We'll always be awash in a sea of tag soup.
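To make the compiler-versus-browser contrast concrete, here's a rough sketch in Python. It feeds the same scrap of tag soup to a lenient HTML parser and to a strict XML parser. The assumptions: Python's `html.parser` stands in for a browser's forgiving behavior (real browsers follow a far more elaborate recovery algorithm), and `xml.etree.ElementTree` stands in for the "fail on the first error" model that XHTML implied. `TagLogger`, `lenient_tags`, and `strict_parse_ok` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records the start tags the lenient parser manages to recognize."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup):
    """Parse HTML-style: keep going no matter how sloppy the markup is."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.tags


def strict_parse_ok(markup):
    """Parse XML-style: any well-formedness error rejects the whole document."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Unquoted attribute value, unclosed <br>, unclosed <b>.
tag_soup = "<p class=intro>Hello<br><b>world</p>"
well_formed = '<p class="intro">Hello<br/><b>world</b></p>'

print(lenient_tags(tag_soup))        # ['p', 'br', 'b']
print(strict_parse_ok(tag_soup))     # False
print(strict_parse_ok(well_formed))  # True
```

The lenient parser shrugs and hands back what it found; the strict one refuses the whole document over a missing pair of quotes.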

Your browser doesn't care if your HTML is well-formed. Your users don't care if your HTML is well-formed. So why should you?

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Overflow and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: https://infosec.exchange/@codinghorror