There may be areas of your site that you do not want to appear in search engine listings, such as customer information or other sensitive data. Indeed, there may be reasons why you do not want your site to appear at all in some or all search engines. If this is the case, the way to avoid being listed inadvertently is to employ the Robots Exclusion Protocol (REP). Under this protocol you place a standard text file, named robots.txt, in the root directory of your site. The file contains instructions to robots about whether you will allow them to index parts of your site or not, and it is the first thing a compliant robot will check when it arrives at your site.
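To illustrate where a robot looks first, here is a minimal Python sketch. It assumes the usual convention that the exclusion file is called robots.txt and sits at the root of the host, whichever page the robot was originally sent to. The function name robots_url and the example.com address are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt,
    # and any query string or fragment on the page URL is dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page deep inside a site
print(robots_url("http://www.example.com/customers/accounts.html?id=7"))
# prints http://www.example.com/robots.txt
```

A robot that honours the protocol would request this address before fetching anything else from the site.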
It is not in a robot's interest to waste time indexing your site if there is no need, or to index everything on the site if there are large areas that are not relevant, so nearly all robots are happy to follow the instructions in the REP. This means that the robots do not waste their bandwidth, nor do they waste yours.
The instructions in a REP file resemble a programming language, although in reality they are simply named fields and values that the robot reads and translates into rules for itself. For the robot to recognise them, the instructions must therefore be written in a specific format.
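As a rough illustration of that format, the sketch below feeds a small set of rules to the robots.txt parser in Python's standard library (urllib.robotparser) and asks whether particular pages may be fetched. The rules, the robot names ExampleBot and OtherBot, the paths and the example.com address are all invented for this example, and the parser shows only how one library interprets the file, not how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: each line is a field name, a colon, and a value.
# A blank line separates one group of rules from the next.
SAMPLE_RULES = """\
User-agent: *
Disallow: /customers/
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
# parse() accepts the file's contents as a list of lines.
parser.parse(SAMPLE_RULES.splitlines())

SITE = "http://www.example.com"  # hypothetical site


def allowed(agent: str, path: str) -> bool:
    """Return True if the named robot may fetch the given path."""
    return parser.can_fetch(agent, SITE + path)


# Any robot without its own group falls under the "*" rules.
print(allowed("OtherBot", "/customers/list.html"))  # prints False
print(allowed("OtherBot", "/index.html"))           # prints True
# ExampleBot has its own group, which shuts it out of the whole site.
print(allowed("ExampleBot", "/index.html"))         # prints False
```

Here the first group keeps every robot out of two directories, while the second excludes one named robot from the site altogether.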
This article was first published on 04 June 2002 and does not necessarily match current events or the current opinions and views of bigmouthmedia ltd.













