Canonicalization: Not Just for Popes

You may remember the ASP.NET canonicalization vulnerability from last year. And what exactly is canonicalization? From Microsoft’s Design Guidelines for Secure Web Applications:

Data in canonical form is in its most standard or simplest form. Canonicalization is the process of converting data to its canonical form. File paths and URLs are particularly prone to canonicalization issues and many well-known exploits are a direct result of canonicalization bugs. For example, consider the following string that contains a file and path in its canonical form.

c:\temp\somefile.dat

The following strings could also represent the same file.

somefile.dat
c:\temp\subdir\..\somefile.dat
c:\temp\somefile.dat\ ..somefile.dat
c%3A%5Ctemp%5Csubdir%5C%2E%2E%5Csomefile.dat


In the last example, characters have been specified in hexadecimal form:


• %3A is the colon character.
• %5C is the backslash character.
• %2E is the dot character.

You should generally try to avoid designing applications that accept input file names from the user to avoid canonicalization issues. Consider alternative designs instead. For example, let the application determine the file name for the user. If you do need to accept input file names, make sure they are strictly formed before making security decisions such as granting or denying access to the specified file.
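
To make "strictly formed before making security decisions" concrete, here is a minimal sketch in Python. It is an illustration, not the method from the Microsoft guidelines. The function names `canonicalize` and `is_inside` are hypothetical, and the base directory `c:\temp` is only assumed from the example above. The sketch decodes percent-escapes once, resolves the path lexically with the standard library's `ntpath` module (which applies Windows path rules on any operating system), and only then checks whether the result stays inside the allowed directory.

```python
import ntpath
from urllib.parse import unquote

BASE_DIR = r"c:\temp"  # assumed base directory, taken from the example above


def canonicalize(raw: str, base: str = BASE_DIR) -> str:
    """Return one canonical spelling of a user-supplied Windows path.

    Hypothetical helper: decodes percent-escapes a single time, resolves the
    path against `base`, collapses `..` segments, and lowercases the result.
    This is purely lexical; it never touches the file system.
    """
    decoded = unquote(raw)                    # %3A -> ':', %5C -> '\', %2E -> '.'
    joined = ntpath.join(base, decoded)       # relative names resolve under base
    return ntpath.normcase(ntpath.normpath(joined))


def is_inside(raw: str, base: str = BASE_DIR) -> bool:
    """Make the access decision on the canonical form, never on the raw input."""
    root = ntpath.normcase(ntpath.normpath(base))
    canonical = canonicalize(raw, base)
    return canonical == root or canonical.startswith(root + "\\")


if __name__ == "__main__":
    for candidate in (
        "somefile.dat",
        r"c:\temp\subdir\..\somefile.dat",
        "c%3A%5Ctemp%5Csubdir%5C%2E%2E%5Csomefile.dat",
        r"..\secret.txt",
    ):
        print(candidate, "->", canonicalize(candidate), is_inside(candidate))
```

The first three inputs all collapse to the same string, which is the whole point: the check runs against one spelling instead of trying to anticipate every disguise. A lexical check like this does not account for symbolic links, short 8.3 file names, or input that has been encoded twice, so treat it as a starting point rather than a complete defense.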

Seems straightforward enough; there can be only one true representation of the data, just like there’s only one Pope. And popes don’t canonicalize: they canonize. Which means the words “canonicalize” and “canonicalization” are artificially fabricated technical mumbo-jumbo. As if we didn’t have enough of that to go around already:

We are asking for your help in eradicating words that have been invented for no good reason. Sometimes, it’s too late to do anything about them. Look at the word “canonicalize,” for instance. It is used to mean “to create the canonical form” of something, like a URL (as in InternetCanonicalizeUrl from the WinINet API). It’s not English; it was invented because someone didn’t know that there was already a perfectly adequate word for this process: “canonize.” However, once this non-word has been created, the rules of the language suddenly apply again, so the process of “canonicalizing” something is “canonicalization” instead of “canonization.”

More recently, we’ve seen the word “performant” start its crawl into the everyday vocabulary of devspace. It is used to mean “highly performing.” It’s also not a word. When something provides information, it’s informative. It’s not “informant.” The word “performant,” if it existed, would be a noun – not an adjective. But it doesn’t exist, so if you do see it in print, remember that it’s not really there.

Any readers who have made it this far are probably rolling their eyes now, thinking to themselves, “Why are they being such sticklers here? Isn’t the language a wonderful, evolving thing?” Yes, our language is evolving. As there is a need for new words, new words enter the language. But making up new words is just as bad as using fancy words in place of short ones. Why say “This project’s goals are orthogonal to the company’s needs?” Admit it – if you were at home, you’d just say “different from” or “at odds with.”

It’s one thing to use technical jargon excessively, but the perpetuation of new jargon for jargon’s sake is particularly Orwellian. Along those same lines, you may also be interested in Cyrus’ list of commitments.

  1. reinvent value-added markets
  2. brand e-business technologies
  3. benchmark value-added content
  4. optimize one-to-many infrastructures
  5. enable innovative niches
  6. integrate real-time mindshare
  7. aggregate collaborative content
  8. repurpose transparent platforms
  9. reinvent visionary solutions
  10. visualize end-to-end initiatives

Is it clear? As an unmuddied lake, sir. As clear as an azure sky of deepest summer.
