Whatever Happened to the META Tag?

When was the last time you saw a HTML header like this?

<head>
<title>GUID World</title>
<meta name="description"
content="Everything you wanted to know about GUIDs but were afraid to ask">
<meta name="keywords"
content="GUID, UUID, globally unique identifiers, 128-bit">
</head>

The web is a metadata-free zone. It's widely known that Google completely ignores metadata in its indexes. The <meta> tag has fallen so far out of favor that it drags the whole concept of metadata down with it. And perhaps rightfully so. Cory Doctorow viciously deconstructs metadata in Metacrap: Putting the torch to seven straw-men of the meta-utopia:

There are at least seven insurmountable obstacles between the world as we know it and meta-utopia. I'll enumerate them below:.

1. People lie

Metadata exists in a competitive world. Suppliers compete to sell their goods, cranks compete to convey their crackpot theories (mea culpa), artists compete for audience. Attention-spans and wallets may not be zero-sum, but they're damned close. That's why:

  • A search for any commonly referenced term at a search-engine like Altavista will often turn up at least one porn link in the first ten results.
  • Your mailbox is full of spam with subject lines like "Re: The information you requested."
  • Publisher's Clearing House sends out advertisements that holler "You may already be a winner!"
  • Press-releases have gargantuan lists of empty buzzwords attached to them.

Meta-utopia is a world of reliable metadata. When poisoning the well confers benefits to the poisoners, the meta-waters get awfully toxic in short order.

The other six reasons are equally caustic, and all have a common theme: relying on users to create accurate metadata means you're betting on an optimistic view of human behavior. And we all know how well that works out.

Which brings me to the complete abandonment of the <meta> tag. Isn't it ironic that groups still advocate manually adding metadata to web pages? Who, exactly, is adding The Dublin Core Metadata Element Set to the <head> section of their web pages? Nobody, that's who.

Manual metadata may be suspect, but automated generation of metadata is practically the holy grail. Google's entire 450 zillion dollar market cap is predicated on one tiny, automatically generated piece of metadata on every web page they index: PageRank. Popularity rules the web. It's high school all over again: either you're popular and people link to you, or.. well, good luck on that whole prom thing.

But popularity has some limitations. For one thing, PageRank doesn't work on an intranet. Office documents are rarely HTML, rarely linked to each other, and you probably don't have a large enough sample set to do any fancy statistical analysis, either. That's why the Google Search Appliance not only actively indexes metadata in the <meta> tag, it requires metadata to return relevant results. It's right in the manual. Just try doing that with the capital-g Google.

Perhaps that's why Tim Bray steadfastly maintains that some form of metadata is necessary to improve search results.

One of the Web's distinguishing features is that there's a big gaping hole where the metadata ought to be. The Web has resources, identified by URI, and you can ask for "representations," which come with some metadata, but the metadata is about the representation, not the resource. Given a URI, the Web has no built-in way to ask questions about it, for example "What is this about?" or "When does it expire?" or "Is this suitable for children?" or "Is this good?"

I'm not an advocate of the utopian semantic web, mind you, but I sure would like something that can tell the difference between a Jaguar and a Jaguar instead of telling me which one is more popular.

Read more

Stay Gold, America

We are at an unprecedented point in American history, and I'm concerned we may lose sight of the American Dream.

By Jeff Atwood · · Comments

The Great Filter Comes For Us All

With a 13 billion year head start on evolution, why haven't any other forms of life in the universe contacted us by now? (Arrival is a fantastic movie. Watch it, but don't stop there - read the Story of Your Life novella it was based on

By Jeff Atwood · · Comments

I Fight For The Users

If you haven't been able to keep up with my blistering pace of one blog post per year, I don't blame you. There's a lot going on right now. It's a busy time. But let's pause and take a moment

By Jeff Atwood · · Comments

The 2030 Self-Driving Car Bet

It's my honor to announce that John Carmack and I have initiated a friendly bet of $10,000* to the 501(c)(3) charity of the winner’s choice: By January 1st, 2030, completely autonomous self-driving cars meeting SAE J3016 level 5 will be commercially available for passenger

By Jeff Atwood · · Comments