Of Spaces, Underscores and Dashes
I try to avoid using spaces in filenames and URLs. They're great for human readability, but they're remarkably inconvenient in computer resource locators:
- A filename with spaces has to be surrounded by quotes when referenced at the command line:
  XCOPY "c:\test files\reference data.doc" d:
  XCOPY c:\test-files\reference-data.doc d:
- Any spaces in URLs are converted to the encoded space character by the web browser:
  http://domain.com/test%20files/reference%20data.html
  http://domain.com/test-files/reference-data.html
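To see the URL case for yourself, here is a minimal sketch using Python's standard `urllib.parse.quote`. The helper name `encode_path` is my own, purely for illustration, and the paths are the same made-up ones from the list above. It approximates what a browser does to a path, not any particular browser's exact behavior.

```python
from urllib.parse import quote


def encode_path(path):
    """Percent-encode a URL path, leaving the '/' separators intact."""
    # quote() never escapes letters, digits, '-', '.', '_' or '~';
    # a space is not in that set, so it becomes %20.
    return quote(path, safe="/")


spaced = encode_path("/test files/reference data.html")
dashed = encode_path("/test-files/reference-data.html")

print(spaced)  # /test%20files/reference%20data.html
print(dashed)  # /test-files/reference-data.html
```

The dashed path comes through unchanged, which is the whole appeal.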
So it behooves us to use something other than a space in file and folder names. Historically, I've used underscore, but I recently discovered that the correct character to substitute for space is the dash. Why?
The short answer is, that's what Google expects:
If you use an underscore '_' character, then Google will combine the two words on either side into one word. So bla.com/kw1_kw2.html wouldn't show up by itself for kw1 or kw2. You'd have to search for kw1_kw2 as a query term to bring up that page.
The slightly longer answer is, the underscore is traditionally considered a word character by the \w regex operator.
Here's RegexBuddy matching the \w operator against multiple ASCII character sets:
As you can see, the dash is not matched, but underscore is. This_is_a_single_word, but this-is-multiple-words.
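The same behavior is easy to reproduce without RegexBuddy. This is a small sketch using Python's `re` module; the helper name `words` is hypothetical. It only demonstrates how `\w` treats the two characters in one regex engine, and says nothing about how Google actually tokenizes URLs.

```python
import re


def words(text):
    """Return every maximal run of word characters in text."""
    # \w matches letters, digits and the underscore, but not the dash,
    # so a dash ends one match and the next match starts after it.
    return re.findall(r"\w+", text)


underscored = words("This_is_a_single_word")
dashed = words("this-is-multiple-words")

print(underscored)  # ['This_is_a_single_word']
print(dashed)       # ['this', 'is', 'multiple', 'words']
```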
Like NutraSweet and Splenda, neither is really an acceptable substitute for a space, but we might as well follow the established convention instead of inventing our own. That's how we ended up with the backslash as a path separator.