A little while ago, my hosting company (1and1.com), which I actually like quite a bit, decided that I was using more than my fair share of the shared server and warned me that I was “abusing” the system. They told me that if I didn’t reduce my server load, I would have to move up to a (rather costly) dedicated server.
After an initial panic attack, I finally took a long overdue look at caching and search engine crawlers. Here’s what I found in that process:
- As it turns out, it doesn’t make sense to let search engine crawlers index content that either shouldn’t be indexed or that adds little to search engine placement. Large media files and PDFs, and possibly images, are good examples. Preventing indexing of these is easily accomplished with a well-crafted robots.txt file (the file that tells search engines how to handle your site’s content). I found a good write-up of this approach on Perishable Press.
- It’s important to keep an eye not just on Google Analytics stats but also on the raw HTTP request (GET/POST) logs, especially when the hosting company frowns on heavy use in a shared-server environment.
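The raw-log check above can be sketched in a few lines of Python. The log lines below are made-up samples in Apache’s “combined” format; your host’s actual log location and format may differ:

```python
# Tally requests per user agent from Apache combined-format log lines.
# The sample_log entries are fabricated examples, not real traffic.
from collections import Counter

sample_log = [
    '1.2.3.4 - - [01/Jan/2024:00:00:00 +0000] "GET /feed/ HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '1.2.3.4 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '5.6.7.8 - - [01/Jan/2024:00:00:02 +0000] "GET /feed/ HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]

def user_agent(line):
    # Splitting on double quotes, field 5 (0-based) is the user-agent string.
    return line.split('"')[5]

counts = Counter(user_agent(line) for line in sample_log)
for agent, n in counts.most_common():
    print(n, agent)
```

Running something like this against a day’s worth of real log lines makes it obvious whether crawlers, rather than human visitors, are generating most of the load.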
I tried out the following robots.txt file on my server (which actually hosts a few other sites, too) and saw the drop in server requests shown in the image at the top of this post. Pretty impressive, isn’t it? And as you can see, the improvement happened overnight!
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
Disallow: /otherprivatefolder/
Allow: /downloads/
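A file like this can be sanity-checked before deploying it, for example with Python’s standard urllib.robotparser module. This is just a quick sketch using the rules above; note that well-behaved crawlers may still take days to pick up a changed robots.txt:

```python
# Check which paths the robots.txt rules block, using the standard library.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /feed/
Disallow: /trackback/
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /wp-includes/
Disallow: /xmlrpc.php
Disallow: /wp-
Disallow: /otherprivatefolder/
Allow: /downloads/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# "Disallow: /wp-" is a prefix match, so it also blocks e.g. /wp-login.php.
print(rp.can_fetch("*", "/feed/"))              # blocked
print(rp.can_fetch("*", "/wp-login.php"))       # blocked
print(rp.can_fetch("*", "/downloads/file.zip")) # allowed
```

This catches the classic mistake of a Disallow line that is broader (or narrower) than intended before any crawler ever sees it.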