Google Not Fully Respecting Robots.txt File?
Google does not appear to be fully respecting robots.txt right now. I’ve encountered a few cases of this today – including Google’s own Blogger.com.
Checking Blogger’s robots.txt file shows a short list of disallowed URLs:
# robots.txt for http://www.blogger.com
User-agent: *
Disallow: /profile-find.g
Disallow: /comment.g
Disallow: /email-post.g

However, a Google search using the query “site:blogger.com profile find” returns the following:

[Screenshot of the search results]
As you can see, the first result returned is exactly the disallowed URL. Note that it is indexed, but is apparently not cached – there is no search listing snippet.
Although the page is not being cached, the fact that it is being indexed at all suggests that Google is not fully respecting robots.txt. This seems to be a recent development, and hopefully it is just a bug that will soon be patched, rather than a change in Google’s behavior.
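If you want to confirm that a URL really is covered by the rules quoted above, here is a minimal sketch using Python's standard urllib.robotparser module. It only evaluates what the robots.txt rules say about fetching a URL; it says nothing about what Google actually does with it. The helper name is_crawl_allowed, the "Googlebot" user-agent string, and the /start path are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the Blogger robots.txt quoted in this post.
ROBOTS_TXT = """\
# robots.txt for http://www.blogger.com
User-agent: *
Disallow: /profile-find.g
Disallow: /comment.g
Disallow: /email-post.g
"""


def is_crawl_allowed(url, user_agent="Googlebot"):
    """Return True if the quoted rules permit user_agent to fetch url.

    Hypothetical helper for illustration; it parses the inline copy of
    the rules rather than downloading the live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "/start" is a made-up path, included only as a contrast case
    # that none of the Disallow rules cover.
    for path in ("/profile-find.g", "/comment.g", "/email-post.g", "/start"):
        url = "http://www.blogger.com" + path
        print(url, "allowed" if is_crawl_allowed(url) else "disallowed")
```

Under these rules the three listed paths come back as disallowed for any crawler, since the only group is User-agent: *, while a path outside the list is allowed.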



