Coding Horror

programming and human factors

If It's Not in Google, Does Your Website Really Exist?

Rich Skrenta, who may have written the first microcomputer virus, calls Google the start page for the Internet:

The net isn't a directed graph. It's not a tree. It's a single point labeled G connected to 10 billion destination pages.

If the Internet were a monolithic product, say the work of some alternate-future AT&T that hadn't been broken up, then you'd turn it on and it would have a start page. From there you'd be able to reach all of the destination services, however many there were.

Well, that's how the net has organized itself after all.

From this position, Google derives immense and amazing power. And they make money, but not only for themselves. Google makes advertisers money. Google makes publishers money. Google drives multi-billion dollar industries profiting from Google SEM/SEO.

Most businesses on the net get 70% of their traffic from Google. These businesses are not competitors with Google, they are its partners, and have an interest in driving Google's success. Google has made partners of us all.

But what happens when the start page for the internet, the source of more than 70 percent of your traffic, decides it will no longer index your web site?

That's exactly what happened to Javalobby over the Christmas break:

It had been aggravating to spend holiday time cleaning up the unwanted [50,000 spam forum messages], but the real problem didn't surface until we started going through our normal morning routine yesterday, having just returned to work from our holiday break. We generally take a look at a variety of statistics in the morning before proceeding into whatever development work we're doing. Having been out of the office for almost two weeks, we had a lot of stats to look at. It took no time to see that something was wrong - traffic was down. A little more investigation revealed the problem.

We had completely disappeared from Google's main index! If you run a website, then you know how serious a problem this is. On any given day over 10,000 visitors arrive at Javalobby as a result of Google searches, and suddenly they stopped coming! We had apparently been grouped together with the spammer's viagra and casino sites, and poof! Suddenly we no longer existed in the eyes of Google, the world's largest search engine. Countless thousands of well-ranked pages gone in a blink. Perhaps you now understand why I would commit a violent crime if I caught those forum spammers? In essence, they have wiped out strategic positioning that we took years to build.

Google's response in this situation is arguably justified. They can't have query results redirecting users to sex sites; the public good requires that rogue or defaced websites get removed from their index as soon as they're discovered. Google's Matt Cutts, in response to a similar Google delisting incident posted on Slashdot, wrote an entire blog post documenting exactly how Google handles hacked websites:

But let's take a step back. This site was hacked and stuffed with a bunch of hidden spammy porn words and links. Google detected the spam in less than 10 days; that's faster than the site owner noticed it. We temporarily removed the site from our index so that users wouldn't get the spammy porn back in response to queries. We made it possible for the webmaster to verify that their site was penalized. Then we emailed the site, with the exact page and the exact text that was causing problems. We provided a link to the correct place for the site owner to request reinclusion. We also made the penalty for a relatively short time (60 days), so that if the webmaster fixed the issue but didn't contact Google, they would still be fine after a few weeks.

Ultimately, each site owner is responsible for making sure that their site isn't spammy. If you pick a bad search engine optimizer (SEO) and they make a ton of spammy doorway pages on your domain, Google still needs to take action. Hacked sites are no different: lots of spammy/hacked sites will try to install malware on users' computers. If your site is hacked and turns spammy, Google may need to remove your site, but we will also try to alert you via our webmaster console and even by emailing you to let you know what happened. To the best of my knowledge, no other search engine confirms any penalties to sites, nor do they email site owners.

I had completely forgotten about Google's Webmaster console until Matt mentioned it. If you own a website, you should take advantage of these tools. They'll let you diagnose and fix most Google-related problems. On top of that, they'll even give you some basic stats on the search queries people used to get to your website. All you have to do is prove ownership of your website by either uploading a specially-named file, or modifying a page to include a specific META tag.
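To make the META tag option concrete, here is a minimal sketch of how you might check that a page actually carries the verification tag before asking Google to look for it. The tag name `google-site-verification` is an assumption on my part (the exact name Google asks for has changed over the years), and the token `abc123` and the helper names are hypothetical placeholders, not anything Google provides.

```python
from html.parser import HTMLParser


class MetaTagFinder(HTMLParser):
    """Collects the content attribute of every <meta> tag with a given name."""

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name.lower()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        if attr_map.get("name", "").lower() == self.meta_name:
            self.contents.append(attr_map.get("content", ""))


def has_verification_tag(html, expected_token,
                         meta_name="google-site-verification"):
    """Return True if the page has a matching meta tag with the expected token.

    meta_name is an assumed default; pass whatever name the console gives you.
    """
    finder = MetaTagFinder(meta_name)
    finder.feed(html)
    return expected_token in finder.contents


if __name__ == "__main__":
    # Hypothetical page and token, for illustration only.
    page = (
        '<html><head>'
        '<META NAME="google-site-verification" CONTENT="abc123">'
        '</head><body>Hello</body></html>'
    )
    print(has_verification_tag(page, "abc123"))
```

In practice you would fetch your own home page and feed its HTML to `has_verification_tag`; the check only confirms the tag is present in the markup, not that Google has accepted it.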

But let's put aside, for a moment, the fact that the webmaster response to Google's delisting was a little hysterical in the face of Google's excellent webmaster tools. Can you blame them? You'd probably be upset, too, if more than 70 percent of the traffic to your website disappeared overnight.

That's the truly scary part. Google's lead over its competitors is so complete, so total, that if your website isn't in Google, it effectively doesn't exist. I'm not sure the Microsoft monopoly has ever wielded that kind of power. And even if they did, it would by definition be limited to desktops. Google has shown few signs of abusing their position so far. But I'm not sure I'm comfortable with a single company having such near-absolute power over the sum of all information on the internet, either.

Written by Jeff Atwood

Indoor enthusiast. Co-founder of Stack Overflow and Discourse. Disclaimer: I have no idea what I'm talking about. Find me here: https://infosec.exchange/@codinghorror