SEO and Robot Commands: Can They Live Peacefully Together?
While working in SEO you will run into snags that cause issues with the search engine results. These can include duplicate content problems and websites dropping out of the search results, which can be the ultimate frustration. You may also want to restrict certain areas of your site because of design or SEO related problems that could harm your listings. In these cases you need robot commands.
The most frequently used is the robots.txt file. Although it is a simple way to request that a page or folder is either crawled or ignored, it does not guarantee that a particular page will stay out of the index. It is used so frequently because it is a simple addition to a site and gives clear instructions about which pages should be accessed, letting you list the relevant page addresses or folders. However, because there are no guarantees, you may need a different instruction to make sure you get what you want.
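As a rough illustration of how such rules are read, here is a minimal sketch that uses Python's standard urllib.robotparser module. The robots.txt content, the folder names and the example.com addresses are all made up for the example, and the result only shows what a well-behaved crawler is asked to skip, not what will end up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the folders are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="Googlebot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


print(is_crawlable("http://www.example.com/print/page.html"))  # False
print(is_crawlable("http://www.example.com/about.html"))       # True
```

The file itself would live at the root of the site (for example http://www.example.com/robots.txt), with one Disallow line per page or folder you want crawlers to stay out of.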
If you want to not just block a page but effectively remove it as far as Google is concerned, you can use the meta tag “noindex”, which results in that page being left out of the index. The crawler has to be able to fetch the page in order to see the tag, so the page should not also be blocked in robots.txt. You probably have a good reason for wanting to block particular pages from the search engines, but bear in mind that any links you have on those pages will not be adding any benefit to your SEO and could potentially have a negative effect on your campaign. So take care when blocking or removing pages.
There are other commands you can use to be more specific about what you want accessed on a site. Below are just some of the meta tags that you can use for various purposes.
- Noindex – As discussed earlier, this instructs Google to leave the page in question out of its index.
- Nosnippet – This prevents the description appearing under the link in the SERPs.
- Noarchive – This stops a cached version of the page being accessible in the SERPs.
- Nofollow – A common command which tells Google not to follow the links on a page. This is a big no-no for consultants when building links, as you want Google to take those high quality links into account when indexing the site.
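To show where these commands sit in a page, the sketch below reads the robots meta tag out of a made-up page head using Python's standard html.parser module. The page content and the helper names (RobotsMetaParser, robots_directives) are assumptions for illustration; the part that matters for your own site is the meta line inside the head.

```python
from html.parser import HTMLParser

# Hypothetical page head; only the <meta name="robots"> line matters here.
PAGE = """
<html><head>
<title>Printer-friendly version</title>
<meta name="robots" content="noindex, nofollow">
</head><body>...</body></html>
"""


class RobotsMetaParser(HTMLParser):
    """Collect the directives listed in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


print(sorted(robots_directives(PAGE)))  # ['nofollow', 'noindex']
```

Several commands can share one tag by separating them with commas, as in the example, so a quick check like this can help confirm that a page carries only the commands you intended.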
Duplicate Content and the Canonical tag
One of the most common reasons people block pages is duplicate content, because unfortunately only one of the URLs carrying that content can be displayed by Google. The question is which URL it should use for the search results. If left to its own devices, Google may choose the wrong URL from an SEO point of view. To combat this you can add a canonical tag to each page of that duplicate content to specify which version is the preferred one. The tag would look like this, with the appropriate link to the page you want to take priority (the address here is only a placeholder):
<link rel="canonical" href="http://www.example.com/preferred-page/" />
This lets Google see that, although there are duplicate content issues, you are telling it which page should be the one to be indexed.
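The canonical tag is a link element placed in the head of the duplicate page. As a minimal sketch, again using Python's standard html.parser, the code below pulls the preferred address out of a made-up page so you can check that each duplicate points where you expect. The page, the example.com address and the helper names (CanonicalParser, canonical_url) are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical duplicate page that points at its preferred version.
DUPLICATE_PAGE = """
<html><head>
<title>Blue widgets (sorted by price)</title>
<link rel="canonical" href="http://www.example.com/blue-widgets/" />
</head><body>...</body></html>
"""


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_url(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


print(canonical_url(DUPLICATE_PAGE))  # http://www.example.com/blue-widgets/
```

Running the same check over every URL that carries the duplicated content should return one and the same address. If it does not, the pages are sending Google mixed signals about which version to index.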
Be careful when implementing robot commands, as the wrong tag on the wrong page could cause bigger problems for your SEO. Overall, though, they are simple and quick tools that give Google more direction and let you state more directly what you want.

March 28th, 2011