SEO Technology

The first commercial search engines, such as Yahoo! and Galaxy, were directories, so the technology behind the sites in their indexes was (by and large) not a real issue, apart from aesthetic and site quality considerations. However, with the introduction of the major early spider-based search properties, such as Lycos, AltaVista and Inktomi, the ability of "robots" to investigate websites became a major consideration. Robots, colloquially known as spiders, are pieces of software used by some search engines to investigate the content of websites and report their findings to the search engine's database. Search results are then ranked by an algorithm that attaches priorities to certain aspects of the database, and this determines the order in which sites appear in the search engine results pages.

There are, however, many aspects of site coding that may present barriers to search engine robots. Many robots were programmed at around the time of the first few search engines, so their understanding of HTML is in many ways stuck in the mid-to-late 1990s. Many have difficulty parsing HTML that webmasters take for granted, such as framesets, embedded tables, image links and image maps, and JavaScript/DHTML. Some robots have evolved well, Googlebot (the robot used by Google) being a notable example of one that moves with the times, but just as many have not really changed in the six years or so that they have been in existence. This means that an SEO company must fully understand how intricate site coding may present barriers to search engine robots, and also how simple technical changes to a site's coding may present opportunities to improve rank.
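To make the barrier concrete, here is a minimal sketch in Python of a deliberately simple robot that only follows plain anchor links. The class name NaiveLinkRobot, the function extract_links and the sample page are all invented for illustration; no real search engine robot is claimed to work exactly this way.

```python
from html.parser import HTMLParser


class NaiveLinkRobot(HTMLParser):
    """Hypothetical robot that only understands plain <a href="..."> links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Frames, image map areas and anything else are ignored entirely.
        if tag != "a":
            return
        for name, value in attrs:
            # JavaScript pseudo-links cannot be followed without a script engine.
            if name == "href" and value and not value.lower().startswith("javascript:"):
                self.links.append(value)


def extract_links(html):
    """Return the links this simple robot would be able to follow."""
    robot = NaiveLinkRobot()
    robot.feed(html)
    return robot.links


# Illustrative page: four destinations, only one behind a plain link.
PAGE = """
<html><body>
<a href="/products.html">Products</a>
<a href="javascript:openPage('/offers.html')">Offers</a>
<frameset><frame src="/menu.html"></frameset>
<map name="nav"><area href="/contact.html" shape="rect" coords="0,0,10,10"></map>
</body></html>
"""

if __name__ == "__main__":
    print(extract_links(PAGE))
```

Under these assumptions the robot finds only /products.html. The pages reachable through the JavaScript link, the frame and the image map are never discovered, which is why a plain text link to each important page is usually the safer choice.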

It should be remembered, however, that site coding should not be abused to inflate rank artificially. Search engines will consider this spam, and may penalise or bar the site as a result.

One complex and fairly crafty way of fooling search engines is a technique commonly called cloaking. It involves recognising site visitors by their user agent (browser or robot name) or by their IP address. This allows a site to present pages optimised for a specific search engine, meaning that each engine can be optimised for individually. In principle this sounds like quite a good idea, but it is obviously open to abuse, and many search engines have banned it as a result. Google, for example, takes the line that what its spider indexes must be what the users of your site will see. IP and user agent delivery has been used in the past to fool robots into thinking they were indexing popular sites such as Hotmail or Microsoft, when in fact they were indexing pornographic sites. This is obviously not in the interests of search engines, and it is easy to see their point of view.
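The principle that the spider must see what users see can be sketched as a simple check: request the same page under a robot user agent and a browser user agent, then compare the results. The Python below is an illustrative sketch only. The two example sites, the user agent strings and the function looks_cloaked are assumptions made for the example, and nothing here describes how any search engine actually detects cloaking.

```python
# Illustrative user agent strings (assumed values, not taken from the article).
ROBOT_AGENT = "Googlebot/2.1"
BROWSER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0)"


def cloaking_site(url, user_agent):
    """Hypothetical site that serves a different page to a known robot."""
    if "googlebot" in user_agent.lower():
        return "keyword-rich page written for the robot"
    return "unrelated page shown to human visitors"


def honest_site(url, user_agent):
    """Hypothetical site that serves the same page to every visitor."""
    return "the same page for everyone"


def looks_cloaked(fetch, url):
    """Compare what a robot and a browser receive for the same URL.

    `fetch` is any callable taking (url, user_agent) and returning page text.
    """
    robot_view = fetch(url, ROBOT_AGENT)
    browser_view = fetch(url, BROWSER_AGENT)
    return robot_view != browser_view


if __name__ == "__main__":
    print(looks_cloaked(cloaking_site, "http://www.example.com/"))
    print(looks_cloaked(honest_site, "http://www.example.com/"))
```

A check of this kind only covers user agent delivery. A site that cloaks by IP address would pass it unless the comparison request came from an address the site does not recognise as a robot, and pages that legitimately vary between requests would need a looser comparison than strict equality.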


Robots Exclusion





This article was first published on 03 June 2002 and does not necessarily match current events or the current opinions and views of bigmouthmedia ltd.