Do you need a robots.txt file on your website? As with nearly anything, there are pros and cons to adding this file.
Some people feel it is absolutely necessary, while others point out that what you are trying to do with robots.txt can easily be bypassed. Take a look at this information and then decide for yourself.
What Is A Robots.txt File?
A robots.txt file is simply a way to give search engine robots information about what areas of your site should be accessed and which should not.
Imagine that Yahoo! or another big-name search engine has sent out its bot to crawl www.thisisanexample.com.
A well-behaved bot will first check whether the site has a robots.txt file. The directives in this file give the bot its instructions. The file is placed on your web server in the top-level directory (the same place your index file resides).
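As a rough illustration of where a bot looks, the short Python sketch below derives the robots.txt address from any page address on a site. The function name robots_url and the page path /blog/post.html are made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file sits at the top level of the site.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page on the example site used in this article.
print(robots_url("http://www.thisisanexample.com/blog/post.html"))
# prints http://www.thisisanexample.com/robots.txt
```

Whatever page the bot is interested in, the file it asks for first is always the one at the root of the site.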
Keep in mind that following robots.txt is voluntary: a robot can be programmed to ignore the file. Malware bots and email harvesting programs are two examples of bots that will purposely not look for a robots.txt file.
Your robots.txt file is not linked from your pages, so visitors are unlikely to stumble on it, but it is publicly available. Anyone can read its contents and see which pages you are asking bots not to visit.
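To see why this matters, here is a minimal Python sketch that pulls every Disallow path out of the text of a robots.txt file, which is all a curious person (or a badly behaved bot) needs to do. The helper name disallowed_paths and the /members/ directory are assumptions made for the example.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect the path from every non-blank Disallow line."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents; /members/ is an invented private area.
SAMPLE = """\
User-agent: *
Disallow: /tmp/
Disallow: /members/
"""

print(disallowed_paths(SAMPLE))  # prints ['/tmp/', '/members/']
```

In other words, listing a directory in robots.txt hides nothing. It only asks polite bots to stay away, so anything truly private needs real protection such as a login.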
Elements Of A Robots.txt File
Let’s take a look at the common elements of a robots.txt file and what each line means to the bot. The format began as an informal convention rather than a formal standard, but every file is built from two kinds of lines: User-agent and Disallow.
- User-agent: * – the section applies to every robot.
- User-agent: MalBot – the section applies only to the bot named MalBot.
- Disallow: / – instructs the bot not to visit any page on the website.
- Disallow: /tmp/ – means that the bot is not supposed to visit this particular directory.
- Disallow: (left blank) – allows the bot to visit every page, the opposite of Disallow: /.
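The lines above can be tried out with Python's standard urllib.robotparser module, which reads these rules the way a compliant bot would. This is only a sketch: the rules are a made-up file for the example site, and FriendlyBot, SomeOtherBot and the page paths are invented names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one section per bot, separated by blank lines.
RULES = """\
User-agent: MalBot
Disallow: /

User-agent: FriendlyBot
Disallow:

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(bot: str, path: str) -> bool:
    """Ask whether the named bot may fetch the given path on the example site."""
    return parser.can_fetch(bot, "http://www.thisisanexample.com" + path)


print(allowed("MalBot", "/index.html"))           # False: Disallow: / blocks everything
print(allowed("FriendlyBot", "/tmp/cache.html"))  # True: a blank Disallow blocks nothing
print(allowed("SomeOtherBot", "/index.html"))     # True: only /tmp/ is off limits
print(allowed("SomeOtherBot", "/tmp/cache.html")) # False: falls under the * section
```

A bot that has its own named section follows that section, and every other bot falls back to the User-agent: * section.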
You will need a separate Disallow line for each file or directory you want to keep bots away from. Do not put blank lines inside a section, because a blank line marks the end of one section and the start of the next.
Basically, this is it. You can combine these two lines in any way to determine which bots you are targeting and which directories or files they are asked to stay out of.
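If you would rather build the file with a script, a small Python sketch like the one below shows the idea: one User-agent line per bot, one Disallow line per path, and a blank line only between sections. The function name build_robots_txt and the file /private.html are assumptions for illustration.

```python
def build_robots_txt(rules: dict[str, list[str]]) -> str:
    """Turn a mapping of bot name -> disallowed paths into robots.txt text."""
    sections = []
    for bot, paths in rules.items():
        lines = [f"User-agent: {bot}"]
        # One Disallow line per path; no paths means a blank Disallow (block nothing).
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        sections.append("\n".join(lines))
    # Blank lines appear only between sections, never inside one.
    return "\n\n".join(sections) + "\n"


# Hypothetical rules: shut MalBot out entirely, keep everyone else out of two places.
print(build_robots_txt({"MalBot": ["/"], "*": ["/tmp/", "/private.html"]}))
```

The printed text can be saved as robots.txt and uploaded to the top-level directory of the site.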
Where To Get A Free Download For Your Site
If you are not comfortable writing the file yourself, there are many sites that will create a robots.txt file for your site for free.
McCanerin has an easy-to-fill-out form with a fairly comprehensive listing of search engines. Web Tools has one that is simpler but has fewer options, and the SEOChat website also offers a generator. You can find numerous others just by searching for “create robots.txt file”.
When you’ve created your file, you can check it with Google’s robots.txt analysis tool. It is located under webmasters’ help and requires a Google Account.
Adding a robots.txt file to your website can be advantageous. There are many instances when you would not want a search engine robot crawling a particular page. If you allow access only to the pages that are optimised, your search engine ranking should climb higher.
Have a most outstanding day.
Sean Rasmussen, Aussie Internet Marketing
www.SeanSEO.com © 2008 - 2012





This is pretty much over my head. I will need a simpler explanation and my site checked.
Hi Sean,
I’m with Gee. What is the point of blocking certain pages from being crawled? I see where you said that if you only allow access to pages that are optimized, your rankings should increase. Shouldn’t they all be optimized?
Or are you referring to the Contact, About Us pages etc?
It’s hard to go into detail there, Jazz. In a nutshell, the best way to use a robots.txt file is to block pages that you do not want to be ranked for and those that provide no SEO value to your website. There is more to it than that, but I will leave that up to your own research.
As I am new to SEO, is this what this article means: ‘Having control over which website pages the search engine robot can crawl will give you more control over your search engine ranking’?
Hi Sean,
I hope my understanding is correct, Sean. I see a benefit in using the robots.txt file in that I can direct search engine spiders to bypass certain pages for indexing.
I may want this because I have duplicate content or weak content that could bring my page ranking down. There could be other reasons why I want a page to be private and not indexed for optimization by search engines. (?)
You are on the right track there, Jill. Another example would be if you have a private members area of your site that you don’t want to be listed in search results.