Duplicate content has long been a concern for webmasters because it dilutes your ranking in search results. Perhaps the most common cause is allowing both the www and non-www versions of your site to exist. Another place where I see duplicate content occur is when a customer has multiple domain names pointing to the same content. There are good reasons to use multiple domain names, but when you do, you should pick a preferred URL and ensure that any secondary domains are 301 redirected to that URL.
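For the multiple-domain case, here is a minimal .htaccess sketch. It assumes that every domain is served from the same document root, that mod_rewrite is enabled, and that www.mysite.com (a placeholder) is the preferred host. Treat it as a starting point rather than a drop-in rule.

```apache
RewriteEngine On

# Skip requests that carry no Host header at all (old HTTP/1.0 clients)
RewriteCond %{HTTP_HOST} !^$
# Match any host other than the preferred one (placeholder name)
RewriteCond %{HTTP_HOST} !^www\.mysite\.com$ [NC]
# Send the visitor to the same path on the preferred host with a 301
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```

Because the condition is written as "anything except the preferred host", a secondary domain added later should be covered without editing the rule.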
In this video, Google’s Greg Grothaus does a great job explaining duplicate content problems and solutions ranging from 301 redirects to canonical tags.
I have previously blogged about how 301 redirects can prevent common problems from diluting the search engine ranking for your site. There are, however, a number of situations that create duplicate content concerns that are more difficult to address. Webmasters now have an additional tool for dealing with duplicate content in Google’s Webmaster Tools panel: Google lets you tell it which URL parameters to ignore. This will be a great help, especially for many large dynamic sites.
I mentioned in my earlier post that you should choose whether or not to use www and, whichever you choose, stick with it. Another issue is that you can’t always control how someone else will link to your site. While most will include www, others will not. To ensure that all your links are cataloged uniformly in the search engines, you should implement a couple of rules in your .htaccess file.
A rule to ensure www is (or is not) always used.

To redirect mysite.com to www.mysite.com:

    Options +FollowSymLinks
    RewriteEngine On
    # Apply only when the request came in on the bare domain
    RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
    # Redirect to the same path on the www host
    RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
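To redirect www.mysite.com to mysite.com:

    Options +FollowSymLinks
    RewriteEngine On
    # Apply only when the request came in on the www host
    RewriteCond %{HTTP_HOST} ^www\.mysite\.com$ [NC]
    # Redirect to the same path on the bare domain
    RewriteRule ^(.*)$ http://mysite.com/$1 [R=301,L]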
A rule to always redirect your home page (index.html, index.php, default.htm, etc.) to just the root (http://www.yourwebsite.com/):

    Redirect 301 /index.htm http://www.yourwebsite.com/
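One caution on the home page rule: on servers where DirectoryIndex hands out the index file for a request to /, a plain Redirect from the index file back to / can end up looping. A commonly used alternative is sketched below. It assumes mod_rewrite is enabled and that the index file is named index.html, index.htm, or index.php; adjust the pattern and the placeholder host to match your site.

```apache
RewriteEngine On

# THE_REQUEST holds the original request line, e.g. "GET /index.html HTTP/1.1",
# so this only matches when the visitor explicitly asked for the index file
# and not when the server mapped / to it internally.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.(html?|php)[?\ ]
# Redirect the explicit index request to the bare root (placeholder host)
RewriteRule ^index\.(html?|php)$ http://www.yourwebsite.com/ [R=301,L]
```

Checking the original request line rather than the rewritten path is what keeps the internal DirectoryIndex lookup from triggering the redirect again.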
While it won’t apply to most of the people reading this blog, it is also important not to have multiple views of the same information. So if you have a site with a page of toasters and you give the visitor the option to sort by price, color, etc., be sure that the spiders see only one sort order of that information.
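One possible way to do this from .htaccess is sketched here, under several assumptions: the server runs Apache 2.4 or later with mod_headers enabled, and the sort order is chosen with a query parameter named sort (a hypothetical name; use whatever your site actually uses). The idea is to ask search engines not to index the sorted variants while still following their links.

```apache
# Applies only when the query string contains a "sort" parameter (hypothetical name)
<If "%{QUERY_STRING} =~ /(^|&)sort=/">
    # Ask crawlers to leave this variant out of the index but keep following links
    Header set X-Robots-Tag "noindex, follow"
</If>
```

The unsorted page stays indexable, so it remains the single version the spiders catalog. A canonical tag on the sorted pages, as covered in the video above, is another route to the same goal.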