Unbreakable Links Revisited

Philipp Lenssen pointed out that my concept of Unbreakable Links is, unsurprisingly, not a new one. It's also known as

All of these terms really refer to the same thing: using a search engine to build a unique URL. However, there are some not-so-obvious problems you'll encounter when building links this way. To work around them, the Robust Hyperlinks paper proposes using a combination of techniques:

  1. A Unique Identifier (UID) is a name unique within the document, as per ID attributes in SGML/XML. These survive the most violent document changes, except its own deletion.

  2. A Tree Walk describes the path from the root of the document, through internal structural nodes, to a point within media content at a leaf.

    In practice, tree walks are the central component of robust locations. Since tree walks incrementally refine the structural position in the document as the walk proceeds from root to leaf, they are robust to deletions of content that defeat unique ID and context locations. Thus, tree walks are especially helpful for documents such as those that transclude dynamic content, as with stock quotes, where the content itself changes while the structural position remains constant.

    We describe tree walks with a sequence of node child numbers and associated node tags (generic identifiers), terminating with an offset into a media element. This is both a simpler, less expressive, and more redundant, representation than is allowed by XPointer. For example, consider the following tree walk into a particular HTML document:

    21/Professor/8 0/ 0/ADDRESS 1/H3 0/BODY 0/HTML

  3. Context is a small amount of previous and following information from the document tree. We propose a context record containing a sequence of document content prior to the location, and a sequence of document content following the location. For example, for the location described by the tree walk above, let us suppose the word "Professor" is found in a sentence fragment that reads "congratulations on her promotion to Professor in the Computer Science Division". The context descriptor could be:

    her+promotion+to+Professo r+in+the+Computer+Science
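To make the context record concrete, here is a minimal Python sketch. The function name `context_record` and the 25-character window are my own assumptions, not something the paper specifies; that width just happens to reproduce the example above. The location is taken to be offset 8 inside the word "Professor", matching the `21/Professor/8` leaf of the tree walk, which is why the descriptor splits the word in two.

```python
def context_record(text, position, width=25):
    """Return (prior, following) context around a character position.

    Hypothetical helper: `width` is an assumed window size, chosen only
    because it reproduces the example from the quoted passage.
    """
    prior = text[max(0, position - width):position]
    following = text[position:position + width]
    # Spaces become "+" so the record can travel inside a URL.
    return prior.replace(" ", "+"), following.replace(" ", "+")


sentence = ("congratulations on her promotion to Professor "
            "in the Computer Science Division")
# Offset 8 into "Professor", i.e. between "Professo" and "r".
location = sentence.index("Professor") + 8

prior, following = context_record(sentence, location)
print(prior, following)
```

If the document is later edited, a resolver could search for the prior and following strings and put the link back wherever both still match.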

They also propose appending this information to the URL in a query string, so you have both an absolute link and a relative fallback:

Given that lexical signatures are a good way to augment URLs, we are left with the issue of how to associate these with hyperlinks. Our primary requirement is that the solution fit into the existing Web infrastructure moderately well. Our proposal is to append a signature to a URL as if it were a query term. That is, if the URL is http://www.something.com/a/b/c, and the designated resource has the signature w1,...,w5, then the robust URL is

http://www.something.com/a/b/c?lexical-signature="w1+w2+w3+w4+w5"
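Building and unpacking a URL in that shape takes only a few lines. This is a rough Python sketch, not the paper's implementation: the function names `robust_url` and `split_robust_url` are made up for illustration, and I'm assuming the original URL has no query string of its own.

```python
from urllib.parse import parse_qs, urlsplit


def robust_url(url, signature):
    """Append a lexical signature to a URL as if it were a query term.

    Assumes `url` carries no existing query string (hypothetical helper).
    """
    joined = "+".join(signature)
    return f'{url}?lexical-signature="{joined}"'


def split_robust_url(robust):
    """Recover (plain_url, signature_words) from a robust URL."""
    parts = urlsplit(robust)
    # parse_qs decodes "+" as a space, so the words come back space-separated.
    raw = parse_qs(parts.query)["lexical-signature"][0]
    words = raw.strip('"').split()
    plain = parts._replace(query="").geturl()
    return plain, words


link = robust_url("http://www.something.com/a/b/c",
                  ["w1", "w2", "w3", "w4", "w5"])
plain, words = split_robust_url(link)
print(link)
print(plain, words)
```

A client would try the plain URL first, and only if that fails hand the signature words to a search engine.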

I do think, at some point in the future, all links will be constructed this way. The existing absolute link system breaks down over time, and I think it's fairly obvious by now that absolute keyword search is the most effective navigation metaphor for the web. My apologies to Yet Another Hierarchically Organized Oracle, but that style of tree-based directory navigation was always driven by the lack of a competent search engine, not actual choice.

Try building your own unbreakable link with The Incredible LinkTron 5000(tm)!
