Robots.txt isn’t often discussed in articles about SEO, but it is an important consideration. Most sites have some files that they don’t want indexed, for various reasons. Using your robots.txt file well, however, isn’t a simple matter of ‘allow’ and ‘disallow’.
Your site’s code plays an important role in the site’s search engine optimisation. Things like meta tags have an obvious benefit when optimised. The role of robots.txt is less obvious, but the way in which you compose this file can be the difference between ranking at the top for your keywords and failing to be indexed at all.
There are various dos and don’ts when it comes to robots.txt files:
*Do: Examine the directories within your site in search of files you wish to disallow. Disallowing files is a handy way of avoiding duplicate content issues with the search engines.
*Don’t: Disallow all of your files. It may sound obvious, but even webmasters with a good grasp of code have been known to make this mistake. In fact, if your site disappears from the index, it is a good idea to inspect your robots.txt file for this kind of error. Pranksters have been known to disallow all files deliberately.
*Do: Make use of robots.txt when you want to keep web and print versions of content within your site but have only one version indexed, to avoid duplicate content.
*Don’t: Disallow the web version of your files. This is a small matter, but can affect your site greatly. The web versions are the ones your SEO work has focused on, so it is the print versions that should be disallowed.
*Do: Be conscientious about the information you give the search engine spiders in your robots.txt file.
*Don’t: List every file of your site in the robots.txt file. There is such a thing as too much information. Remember that your robots.txt file is viewable by anyone who cares to look. There may be files on your site that you don’t want competitors or the average user to know about.
*Do: Keep sensitive information out of the path of search engine spiders using your robots.txt file. Things like phone numbers and email addresses are information you might think twice about setting free on the search engine results pages. Bear in mind that robots.txt only asks spiders not to crawl a page; a disallowed URL can still be listed in the results if other sites link to it.
*Don’t: List the locations of extremely sensitive information within the robots.txt file of your site. Again, robots.txt keeps some information away from the search engines, but it is a public file.
*Do: Use robots.txt to ensure a better interaction between your site and the search engines.
*Don’t: Mess with your robots.txt without proper knowledge of what you are doing.
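The print-versus-web advice in the list above, and the warning about disallowing everything, can both be checked before a robots.txt file goes live. The following is a minimal sketch using Python's standard `urllib.robotparser` module. It assumes a hypothetical site layout in which print versions live under `/print/`; the sample rules, the `example.com` URLs and the helper names `load_rules` and `blocks_site_root` are illustrative only and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep spiders away from the print versions only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocks_site_root(parser: RobotFileParser, agent: str = "*") -> bool:
    """Return True if the given spider may not fetch the home page.

    This is a rough warning sign for an accidental 'Disallow: /'.
    """
    return not parser.can_fetch(agent, "/")


if __name__ == "__main__":
    rules = load_rules(SAMPLE_ROBOTS_TXT)
    # Assumed URLs for the print and web versions of the same article.
    print_url = "https://www.example.com/print/article.html"
    web_url = "https://www.example.com/articles/article.html"
    print("print version crawlable:", rules.can_fetch("*", print_url))
    print("web version crawlable:", rules.can_fetch("*", web_url))
    print("site root blocked:", blocks_site_root(rules))
```

The standard-library parser uses simple prefix matching, so it may not reflect how a particular search engine treats wildcard patterns. Treat the result as a sanity check rather than a guarantee.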
As you can see, robots.txt is a good method of avoiding duplicate content issues, although it’s not the only way. Robots.txt files should not be used to get around the search engine rules. Above all, remember that these files are publicly available. It is a good idea to consult an expert when using robots.txt files as part of your SEO strategy. Seek the advice of our experts at SEO Consult for further information.
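To see why the public nature of these files matters, here is a rough sketch of how little effort it takes to find a site's robots.txt and read off what it disallows. It uses only Python's standard library; the function names `robots_url` and `disallowed_paths`, the `example.com` address and the sample paths are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt address for the site that hosts page_url.

    The file always sits at the root of the host, so any page URL is
    enough to locate it.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def disallowed_paths(robots_txt: str) -> list[str]:
    """List every path named in a Disallow line, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    print(robots_url("https://www.example.com/blog/post?id=1"))
    # Hypothetical file that gives away more than it should.
    sample = "User-agent: *\nDisallow: /print/\nDisallow: /private/staff-contacts.html\n"
    for path in disallowed_paths(sample):
        print("visible to anyone:", path)
```

Anything that shows up in that list is effectively advertised, which is why the location of truly sensitive material should not appear in the file at all.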