SEO News: Google Highlights Robots.txt Filesize Limit
If you’ve been involved in the SEO industry for any length of time, you’ll be aware of how important robots.txt files can be in certain circumstances.
Robots.txt files can be very useful in determining how your site is viewed by Google and the other search engines out there. Not only can you block the search engines from crawling (and, in most cases, indexing) your site at all, you can also hide specific pages or directories from their view.
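To make that concrete, here is a minimal sketch of how a couple of rules play out, using the robots.txt parser in Python's standard library. The rules, paths and example.com URLs are made up for illustration, and the standard-library parser is only an approximation of how any particular search engine interprets a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /checkout/
Disallow: /search
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS)

# A URL under a disallowed path is reported as blocked...
print(parser.can_fetch("Googlebot", "https://www.example.com/checkout/basket"))
# ...while a URL that matches no Disallow rule stays crawlable.
print(parser.can_fetch("Googlebot", "https://www.example.com/products/widget"))
```

Running a handful of your own important URLs through a check like this is a quick way to confirm that a rule blocks what you intended and nothing more.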
For bigger sites, such as ecommerce platforms with hundreds of pages, this can mean configuring quite a sizeable robots.txt file. And this is exactly where the problem can occur. Via his Google+ page, Googler John Mueller recently reminded SEOs that, ideally, robots.txt files should be no larger than 500KB.
According to Mueller, Googlebot (Google’s web spider) will only read the first 500KB of any robots.txt file. This means that if your file is larger than 500KB, Google could disregard any important information that falls beyond that point. The result could be a robots.txt file that doesn’t perform the task you originally configured it for, potentially causing big problems with your SEO campaign.
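As a rough way to check your own file, the sketch below lists every line that does not fit entirely within the first 500KB. It assumes "500KB" means 500 × 1024 bytes and that anything past the cutoff is simply ignored; the function name and the sample data are hypothetical.

```python
# Assumption: "500KB" is interpreted as 500 * 1024 bytes.
LIMIT_BYTES = 500 * 1024


def lines_past_limit(robots_bytes: bytes, limit: int = LIMIT_BYTES) -> list[str]:
    """Return the lines that are not fully contained in the first `limit` bytes."""
    ignored = []
    offset = 0  # byte position where the current line starts
    for raw in robots_bytes.splitlines(keepends=True):
        # Position just after the last visible character of this line.
        end = offset + len(raw.rstrip(b"\r\n"))
        if end > limit:
            ignored.append(raw.decode("utf-8", errors="replace").strip())
        offset += len(raw)
    return ignored


# Tiny demonstration with an artificially small limit of 30 bytes.
sample = b"User-agent: *\nDisallow: /a\nDisallow: /b\n"
print(lines_past_limit(sample, limit=30))  # ['Disallow: /b']
```

In practice you would pass in the raw bytes of your live robots.txt file. An empty list means every rule sits inside the limit, while any line that comes back is one that Googlebot may never read.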
More than this, an unnecessarily large robots.txt file could be exactly that: unnecessary. It could even suggest that the entire build and coding of your site might need a rethink. With Google and SEO in mind, remember to check your robots.txt file after any work is done on the site, and remember that the simpler your site is to navigate, the better chance Googlebot has of doing so successfully.
