Google Breaking Covenant with Webmasters? Effectiveness of Robots.txt in Doubt

My friend and co-worker John Liu, from The World is Meh, has just written a great blog entry about Google’s apparent indexing of webpages despite explicit prohibitions in the site’s robots.txt file. Take a look at his thoughts on the issue.

Is this just a glitch in the Google matrix or the end of the search engine’s gentlemen’s agreement with webmasters? I hope it’s the former, not the latter. I did some testing with my own robots.txt file in the Google Webmaster Tools Robots.txt Analyzer and I was assured by the analyzer that the restricted URLs in the file would not be spidered when the Googlebot visited my site. For the moment, it seems Google is crawling everything and sending pages “blocked” by robots.txt to its notorious “repeat the search with the omitted results included” section of search results.

Geez… Google mistake or not, you have to do better than this. There was a reason I asked you not to spider this file!

Google does not appear to be fully respecting robots.txt right now. I’ve encountered a few cases of this today – including Google’s own Blogger.com.

Checking Blogger’s robots.txt file shows a short list of disallowed URLs:

# robots.txt for http://www.blogger.com
User-agent: *
Disallow: /profile-find.g
Disallow: /comment.g
Disallow: /email-post.g

However, a Google search using the query “site:blogger.com profile find” returns the following:

[Screenshot: Google search results for the query]

As you can see, the first result returned is exactly the disallowed URL. Note that it is indexed, but is apparently not cached – there is no search listing snippet.

Although the page is not being cached, the fact that it is being indexed at all shows that Google is not fully respecting robots.txt! This seems to be a recent development, and hopefully it is just a bug that will soon be patched, as opposed to a change in Google’s behavior.
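
If you want to double-check what a robots.txt file actually tells a crawler, you can test it locally instead of relying only on an online analyzer. The snippet below is a minimal sketch using Python's standard `urllib.robotparser` module. It assumes the disallowed Blogger paths end in `.g`, and the `is_allowed` helper and the `/about` URL are made up for illustration. It only reports what a compliant crawler is permitted to fetch under these rules; it says nothing about what Google actually does with a URL.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted from Blogger's robots.txt (assumption: paths end in ".g").
ROBOTS_TXT = """\
# robots.txt for http://www.blogger.com
User-agent: *
Disallow: /profile-find.g
Disallow: /comment.g
Disallow: /email-post.g
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Hypothetical helper: True if the rules above permit fetching `url`."""
    parser = RobotFileParser()
    # parse() accepts the file's contents as an iterable of lines,
    # so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The URL that showed up in the search results is disallowed.
    print(is_allowed("http://www.blogger.com/profile-find.g"))  # False
    # A path not listed in the file (made up for this example) is allowed.
    print(is_allowed("http://www.blogger.com/about"))  # True
```

If a URL comes back as `False` here and still appears in search results, the file itself is probably fine and the question is how the search engine is handling it.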

Who we are

Digital Sapien Interactive Team

A team of writers who are dedicated to providing visitors to our blog with insightful information into the world of SEO and digital marketing.