Google and WordPress Search Results Page

SEO Tutorials | December 25th, 2007 | 7 Comments

A while ago I noticed that Google indexes the search results pages of WordPress blogs, such as this one for example; you can see it indexed here (at the time of writing this post).

I have seen this type of indexing on ALL the WordPress blogs I own and on those of clients for whom I coded a WordPress layout. At first my concern was that there was something wrong with the code of my layout. I spent hours and hours examining the code but did not find a reasonable explanation of why Google would index the search results pages of WordPress blogs. The fact that several pages were being indexed ruled out the possibility that someone had simply linked to a search results page of my WordPress blog.

I forgot about the issue for a while, until today, when Nelson of Help Desk Geek pointed it out again. So I figured that instead of wasting hours and hours investigating the code again and trying to come up with a reasonable explanation, it would be smarter to come up with a hack: use a noindex robots meta tag in the head of the blog for the search results pages. Here is what the hack looks like:

<?php if ( is_search() ) { ?><meta name="robots" content="noindex" /><?php } ?>

Placed in your header.php file between <head> and </head>, the code above outputs a noindex robots meta tag on the WordPress search results pages only. You can see it working live here on this blog: view the source of any regular page and you will not find that meta tag anywhere, but perform a search and, once the search results page loads, check the source code again and you will see the noindex robots meta tag.
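If you would rather not read the page source by eye, the check described above can be scripted. The following is only a minimal sketch in Python, not part of the hack itself: the names RobotsMetaFinder and has_noindex are made up for this illustration, and the HTML strings stand in for pages you would save or download from your own blog.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content may hold several directives, e.g. "noindex, follow"
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical sample pages standing in for real page source.
search_page = '<html><head><meta name="robots" content="noindex" /></head></html>'
normal_page = "<html><head><title>Home</title></head></html>"

print(has_noindex(search_page))  # True
print(has_noindex(normal_page))  # False
```

The same helper also shows why the tag has to be typed with straight quotes: if the snippet is pasted with curly quotes around robots and noindex, this parser no longer recognizes it as a robots meta tag, and a crawler may not either.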

I noticed that this was happening with other blogs as well and not only mine, which means I made no mistake in the coding process (phew). You can see the same has happened to John Chow, where you can see the indexed pages, and to many other blogs that use the WordPress blogging platform.

Now, if you don’t feel comfortable adding that robots meta tag to the head, there is another solution: use the robots.txt file to block all search engine spiders from crawling the pages of your blog that have a URL structure similar to http://www.yoursite.com/?s and all its variations (including internal pages linked with a structure like ?s). To block the spiders, just add these lines to your robots.txt file:

User-agent: *
Disallow: /*?

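The rule above stops ALL search engine spiders that honor robots.txt wildcards from crawling the WordPress search results pages. Keep in mind that /*? matches every URL containing a question mark, not only searches, so other query-string URLs on the blog are blocked as well. If you want to target a specific spider, name it in the User-agent line (for example, for Google’s spider use User-agent: Googlebot).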
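To see which URLs a rule like this actually catches, here is a rough sketch in Python of the wildcard matching that Google documents and RFC 9309 describes: * matches any run of characters, a trailing $ anchors the end, and everything else is a prefix match. The function name rule_matches and the sample paths are made up for illustration, and the narrower /*?s= pattern is only a hypothetical alternative, not a rule from this post.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches a URL path.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and anything else behaves as a plain prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the prefix behaviour.
    return re.match(regex, path) is not None


# Hypothetical paths on a WordPress blog.
paths = ["/?s=seo", "/index.php?s=seo", "/?p=123", "/2007/12/some-post/"]

for path in paths:
    print(path, rule_matches("/*?", path), rule_matches("/*?s=", path))
```

Under these assumptions the /*? rule matches the first three paths, including the /?p=123 permalink, while the hypothetical /*?s= pattern matches only the two search URLs. Spiders that do not implement wildcards may treat either pattern differently.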

If that was bothering you, then there is the solution. If you liked this post on the SEO Optimization blog, then subscribe to my RSS feed and receive the updates by email.


7 Responses to “Google and WordPress Search Results Page”
  1. Mike Huang says:

    Interesting post, I’ll keep this in mind.

    -Mike

  2. I’m going to try it on one of my blogs. Thanks for sharing this info.

  3. […] Google and WordPress Search Results […]

  4. smoMashup says:

    Thanks for this great post. I’m using the WP meta that you suggested and it’s working fine on my site. I would like to say for people coming here that are going to try this, since the code posted above is in a ‘blockquote’ it’s not copying and pasting as pure text for whatever reason.

    Here it is in a ‘code’ wrap which should work:

    Thanks again!

  5. smoMashup says:

    the code tags didn’t work as advertised. sorry 🙁

  6. Hi smoMashup,
    Did the meta tag code not work on your site, or was it the example you tried to wrap in the code tags that didn’t work?

  7. NoahsDad says:

    Google openly stated a while back that they would be applying data to forms in an attempt to index the deeper (hidden) web.

    I imagine Google knows what words are relevant to a particular site and fires this list through the WordPress search box hence generating index.php?babies, index.php?graco etc.

    Personally I would rather they didn’t as I prefer to have control over landing pages etc. Your method of blocking works fine at removing this issue.