Stay away from my search result pages [insert search engine name here] bot!!
December 27, 2011 Leave a comment
Now, this is a bit overkill, but there are almost as many ways search crawlers interpret a robots.txt file as there are search engines (that may be an exaggeration, but anyways…).
Now, one thing you probably do not want on your public-facing site is for a search engine to waste its time crawling your search pages. You don’t exactly want a high page rank for your site’s search results, do you?
What to do, what to do?
Well, if your search results pages happen to live under /search/pages/results.aspx, here is an example. Again, it is a bit overkill, but it should get the job done, and the search engines can focus on what you actually want indexed – your content!
Some search bots allow wildcards, some are case-insensitive, some are case-sensitive – hence the number of variations below. Add these to your robots.txt, and you should be good to go.
User-agent: *
Disallow: /search/pages/results.aspx
Disallow: /Search/Pages/Results.aspx
Disallow: /Search/Pages/results.aspx
Disallow: /Search/pages/Results.aspx
Disallow: /search/Pages/Results.aspx
Disallow: /search/pages/Results.aspx
Disallow: /Search/pages/results.aspx
Disallow: /search/Pages/results.aspx
Disallow: /search/
Disallow: /Search/
Disallow: /search/pages/
Disallow: /Search/Pages/
Disallow: /Search/pages/
Disallow: /search/Pages/
Disallow: /*Results.aspx
Disallow: /*results.aspx
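If you want to sanity-check the rules before deploying them, Python’s standard-library robots.txt parser can tell you whether a given URL would be blocked. This is a minimal sketch; the example.com domain is just a placeholder, and note that this parser only checks simple prefix rules – it won’t reflect how every engine handles wildcards or case sensitivity.

```python
from urllib.robotparser import RobotFileParser

# A trimmed-down version of the rules above (example.com is a placeholder).
rules = """\
User-agent: *
Disallow: /search/
Disallow: /Search/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Search results pages should be blocked...
print(rp.can_fetch("*", "http://example.com/search/pages/results.aspx"))  # False

# ...while regular content stays crawlable.
print(rp.can_fetch("*", "http://example.com/articles/my-post.aspx"))  # True
```

This is also a handy way to catch typos in your robots.txt: if `can_fetch` returns True for a page you meant to block, a rule isn’t matching the way you expected.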
Any additions? Please share them here in the comments!