26 March 2009 | Author: R. Falconer, SEO Consultant

HTML 5 - an SEO must-know

HTML 5 will change SEO forever. You may be thinking that forever is a very long time, and that that's a pretty big claim to make - it is, however, a fact. And SEOs should be thinking about the consequences now, if not sooner. HTML 5 is, as its name suggests, the fifth version of HyperText Markup Language, the code used to structure web pages.
Originally conceived by the Web Hypertext Application Technology Working Group (WHATWG), HTML 5 has been the basis of a W3C working group since 2007, and the first Working Draft of the specification was released in January 2008.
HTML 5 will continue to be worked on for years to come, but once the main browsers support it, it won't be long before web sites and search engines start using it.
Search engines have had patents for page segmentation for some time now and there has been some suggestion that they are already using segmentation techniques to a certain extent. Page segmentation basically involves search engines breaking down a page into its component parts and analysing these on an individual basis.
The benefits of doing this are numerous from a search engine's point of view. Page segmentation gives a better understanding of the make-up of a page, allowing pages to be examined more efficiently and their relevance to be judged more accurately. Segmenting a page means a search engine can decide where to look for the main content and spend fewer resources on menus, footers, headers, advertising and other block elements. It would also allow search engines to rank multi-topic pages more accurately.
The goal for any search engine is to be as good as, or better than, a human at deciding on the best-quality, most relevant page. A person reading a web page knows, without consciously thinking about it, where the main content is and which parts are navigation, advertising and so on.
Search engines often struggle with this because the HTML tags holding each of these parts are non-semantic and do not necessarily appear in any particular order.
HTML 5 looks set to change that. Currently, most content is wrapped in <div> or <span> tags regardless of what it is. HTML 5 introduces new tags with semantic meaning, such as <article> (for an independent piece of content, e.g. a blog post or news article), <nav> (for navigation), <footer>, <header>, <audio>, <video> and even a <dialog> element. <aside> can be used to indicate a piece of content removed slightly from the rest of the page in terms of relevance.
It's easy to see how letting sites mark individual blocks of their page with meaningful information would assist in the segmentation of a page. Search engines would be able to know instantly what is what and decide how to treat it.
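To make the idea concrete, here is a minimal sketch in Python of how a crawler might split a page into segments once the blocks are labelled with HTML 5 tags. It is an illustration only, not a description of how any search engine actually works: the Segmenter class, the segment_page function, the choice of tags in SEGMENT_TAGS and the sample page are all assumptions made for this example. It uses only the standard library's html.parser module.

```python
from html.parser import HTMLParser

# Semantic HTML 5 elements treated as segment boundaries.
# This set is an illustrative assumption, not a list from any search engine.
SEGMENT_TAGS = {"header", "nav", "article", "aside", "footer"}


class Segmenter(HTMLParser):
    """Collects visible text under the nearest enclosing semantic element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # open semantic elements, innermost last
        self.segments = {}   # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in SEGMENT_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost semantic element when its end tag arrives.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Text outside every semantic element is filed as "unmarked".
            key = self.stack[-1] if self.stack else "unmarked"
            self.segments.setdefault(key, []).append(text)


def segment_page(html):
    """Return a dict mapping each semantic tag to the text found inside it."""
    parser = Segmenter()
    parser.feed(html)
    parser.close()
    return {tag: " ".join(parts) for tag, parts in parser.segments.items()}


# A hypothetical page marked up with HTML 5 semantic tags.
PAGE = """
<header><h1>Example Site</h1></header>
<nav><a href="/">Home</a> <a href="/blog">Blog</a></nav>
<article><h2>HTML 5 and SEO</h2><p>Semantic tags label the main content.</p></article>
<aside><p>Related reading</p></aside>
<footer><p>Contact us</p></footer>
"""

segments = segment_page(PAGE)
for tag, text in segments.items():
    print(tag, "->", text)
```

With the same page built entirely from <div> tags, every block would land in the "unmarked" bucket and the crawler would be back to guessing which one holds the main content.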
Once enough pages use HTML 5, search engines will inevitably start to use it to improve results. Links and/or content within certain tags will be treated differently from those within others, and the markup of a page will become far more important to SEO than it is currently. This won't be rolled out across the board overnight and is likely to creep in as slowly as HTML 5 itself - but it's hugely important for SEOs to keep on top of what's going on and to ensure their sites are being built with long-term goals in mind.
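As a rough illustration of what treating links differently by tag could look like, the Python sketch below gives each link a weight based on the semantic element that contains it. Everything here is hypothetical: no search engine has published weights like these, and the LINK_WEIGHTS values, the DEFAULT_WEIGHT fallback, the LinkWeigher class and the weigh_links function are invented purely for the example.

```python
from html.parser import HTMLParser

# Hypothetical weights per containing element. These numbers are invented
# for illustration; real search engines do not publish such values.
LINK_WEIGHTS = {"article": 1.0, "aside": 0.5, "nav": 0.2, "footer": 0.1}
DEFAULT_WEIGHT = 0.5  # assumed weight for links outside any listed element


class LinkWeigher(HTMLParser):
    """Records each link's href with a weight taken from its enclosing element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open weighted elements, innermost last
        self.links = []   # (href, weight) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in LINK_WEIGHTS:
            self.stack.append(tag)
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                weight = LINK_WEIGHTS[self.stack[-1]] if self.stack else DEFAULT_WEIGHT
                self.links.append((href, weight))

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def weigh_links(html):
    """Return a list of (href, weight) pairs for every link in the markup."""
    parser = LinkWeigher()
    parser.feed(html)
    parser.close()
    return parser.links


# A hypothetical page fragment with links in three different blocks.
SAMPLE = (
    '<nav><a href="/about">About</a></nav>'
    '<article><p>See <a href="/guide">the guide</a>.</p></article>'
    '<footer><a href="/terms">Terms</a></footer>'
)

for href, weight in weigh_links(SAMPLE):
    print(href, weight)
```

The point for SEOs is the shape of the logic rather than the numbers: once the containing element is known, a link in the main content and a link in the footer no longer have to be treated as equals.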