NOINDEX Robots.txt Directive - unofficial or official?

by SEO & Affiliate Consultant
M. Thomson
The Robots Exclusion Protocol, more commonly known by the name of its file, robots.txt, instructs search engine spiders on what to crawl and what not to crawl within a website.

There have long been rumours that Google and other search engines have been able to support the NOINDEX directive within robots.txt, for example:

User-agent: *
NOINDEX: /example/
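
To make the example above concrete, here is a minimal Python sketch of how a crawler might read such a rule. It is an illustration only, not a description of how Googlebot works: the function names `parse_noindex_rules` and `is_noindexed` are hypothetical, user-agents are matched by exact name with a fallback to `*`, and paths are matched by simple prefix.

```python
def parse_noindex_rules(robots_txt: str) -> dict[str, list[str]]:
    """Collect Noindex path prefixes per user-agent from robots.txt text.

    Assumption: Noindex is treated like Disallow, i.e. a path prefix that
    belongs to the user-agent group it appears in.
    """
    rules: dict[str, list[str]] = {}
    current_agents: list[str] = []
    last_was_agent = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if not last_was_agent:
                current_agents = []  # a new group starts here
            current_agents.append(value.lower())
            rules.setdefault(value.lower(), [])
            last_was_agent = True
        else:
            last_was_agent = False
            if field == "noindex" and value:
                for agent in current_agents:
                    rules[agent].append(value)
    return rules


def is_noindexed(rules: dict[str, list[str]], user_agent: str, path: str) -> bool:
    """Return True if the path falls under a Noindex prefix for this agent."""
    prefixes = rules.get(user_agent.lower(), rules.get("*", []))
    return any(path.startswith(prefix) for prefix in prefixes)


robots = "User-agent: *\nNOINDEX: /example/\n"
rules = parse_noindex_rules(robots)
print(is_noindexed(rules, "Googlebot", "/example/page.html"))
```

Field names are compared case-insensitively, so `NOINDEX:` and `Noindex:` are treated alike in this sketch.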

NOINDEX is "officially" only supported within the Meta Robots tag or the X-Robots-Tag HTTP header, but following our tests, bigmouthmedia can confirm that Googlebot will obey this directive when used within robots.txt. Yahoo! Slurp and MSNBot did not de-index the URL tested.
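
For comparison, the two officially supported mechanisms can be checked with a short Python sketch. It is a simplified illustration under stated assumptions: the names `MetaRobotsParser` and `has_noindex` are hypothetical, only the generic `robots` meta name is inspected (not crawler-specific names such as `googlebot`), and the header value is assumed to be a plain comma-separated list without a user-agent prefix.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str, x_robots_tag: str = "") -> bool:
    """Return True if the page or its X-Robots-Tag header value says noindex."""
    parser = MetaRobotsParser()
    parser.feed(html)
    header_tokens = {token.strip().lower() for token in x_robots_tag.split(",")}
    return "noindex" in parser.directives or "noindex" in header_tokens


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

Both mechanisms require the crawler to fetch the page itself, which is the practical difference from a rule placed in robots.txt.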

Google employee SagarK has commented on its use within robots.txt, but insisted: "we [Google] are always experimenting with different things."

While Google may merely be testing this feature, many search engine optimisation experts are likely to be unaware of its existence.

The Spanish version of the Google Webmaster Help section (as translated using Google Translate) officially states:

"The index of Web pages Google also allows the use of "noindex" in a file robots.txt to prevent references to appear even without tracking URL links in our web search results."

This translation of the statement leads bigmouthmedia to believe that NOINDEX is more than unofficially (or "extraoficialmente") supported.

This intentional support may be simply to prevent the indexing of tracked URLs, e.g.:

http://www.bigmouthmedia.com/example.asp?aff=1234

However, there are many more ways that bigmouthmedia believes this directive could be used.

Preventing the indexing of tracked URLs would be valuable, especially considering that webmasters often block tracking URLs via robots.txt in order to "prevent" duplicate content. While this method is technically reliable, users do occasionally link to "tracked" URLs, and if a Disallow rule is present in robots.txt, no link value will be attributed to those links, leading to a poorer search result from Google. The canonical tag exists for a similar purpose, but can be more expensive to implement.
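
As a rough illustration of the canonical tag approach, the Python sketch below derives a clean URL from a tracked one and builds the corresponding tag. The helper names `strip_tracking` and `canonical_link_tag` are hypothetical, and the set of tracking parameters (here only `aff`, taken from the example URL above) is an assumption that would differ from site to site.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters that only carry tracking information.
TRACKING_PARAMS = {"aff"}


def strip_tracking(url: str) -> str:
    """Return the URL with tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def canonical_link_tag(url: str) -> str:
    """Build a rel="canonical" link tag pointing at the untracked URL."""
    return f'<link rel="canonical" href="{escape(strip_tracking(url))}">'


print(canonical_link_tag("http://www.bigmouthmedia.com/example.asp?aff=1234"))
```

Unlike a Disallow rule, this leaves the tracked URL crawlable, so links pointing at it can still be credited to the clean URL. The cost is that every page template has to emit the tag, which is the implementation expense mentioned above.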

With guidance from Google Webmaster Help ES and EN being slightly different, bigmouthmedia believes it would be useful for Google to standardise its documentation or to comment on the support of this "new" directive. Once tests have been finalised, it seems likely that robotstxt.org, Yahoo! and Bing would be willing to upgrade support. Watch this space!

bigmouthmedia - just what you were searching for
© bigmouthmedia 2010