The Google bots will thank you for not duplicating content on the InterWeb.
Introducing Roboto, a Rails Engine that serves environment-specific robots.txt files. Use it to hide your staging and QA environments from search engines.
Google may penalize you if content is duplicated across multiple domain names. If you have a publicly accessible staging environment, it could be hurting your production environment's search engine rankings. Follow along with this tutorial to hide your staging environment. Note that Roboto only works with Rails 3.1 and higher.
First, remove the generated robots.txt in your Rails App:
#> rm public/robots.txt
Next, add roboto to your Gemfile:
gem 'roboto'
Then, add roboto to your routes (config/routes.rb):
Rails.application.routes.draw do
  mount_roboto
end
You can now specify environment-specific robots.txt files in config/robots. If you haven't done so already, copy config/environments/production.rb to config/environments/staging.rb and configure your staging server (Passenger, Heroku, Unicorn) to run in the staging environment. Then, to add a global disallow to the staging environment, create config/robots/staging.txt with the contents below:
User-Agent: *
Disallow: /
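If you want to sanity-check a robots.txt body before deploying it, something like the following rough sketch may help. It is not part of Roboto; `blocks_all_crawlers?` is a hypothetical helper name, and it only checks whether the wildcard group contains a `Disallow: /` rule, with the `User-Agent` and `Disallow` directives each on their own line. It does not account for `Allow` overrides or other subtleties of the format.

```ruby
# Hypothetical helper (not part of Roboto): returns true when the
# wildcard group ("User-agent: *") contains the rule "Disallow: /".
def blocks_all_crawlers?(robots_txt)
  agents = []        # user agents of the group currently being read
  in_rules = false   # true once the current group has seen a rule line

  robots_txt.each_line do |line|
    line = line.sub(/#.*/, "").strip   # drop comments and whitespace
    next if line.empty?

    field, value = line.split(":", 2).map(&:strip)
    next if value.nil?                 # not a "field: value" line

    case field.downcase
    when "user-agent"
      agents = [] if in_rules          # a rule line closed the previous group
      in_rules = false
      agents << value
    when "allow", "disallow"
      in_rules = true
      if field.downcase == "disallow" && value == "/" && agents.include?("*")
        return true
      end
    end
  end

  false
end
```

You could call this with the contents of config/robots/staging.txt, for example `blocks_all_crawlers?(File.read("config/robots/staging.txt"))`.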
You can also specify a fallback robots/default.txt for any environments you do not need to be explicit about.
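To make the fallback idea concrete, here is a minimal sketch of how such a lookup could work. This is an illustration only, not Roboto's actual implementation: `robots_txt_for` is a hypothetical helper, and returning `nil` when neither file exists is an assumption (the gem may behave differently in that case).

```ruby
# Hypothetical helper, not Roboto's actual code: pick the robots.txt body
# for an environment, falling back to default.txt when no
# environment-specific file exists.
def robots_txt_for(env, robots_dir = "config/robots")
  candidates = [
    File.join(robots_dir, "#{env}.txt"),   # e.g. config/robots/staging.txt
    File.join(robots_dir, "default.txt")   # shared fallback
  ]
  path = candidates.find { |candidate| File.file?(candidate) }
  path && File.read(path)                  # nil when neither file exists (assumption)
end
```

In a Rails app you might call `robots_txt_for(Rails.env)`. With only staging.txt and default.txt present, the staging environment would get its own file and every other environment would get the default.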