Duplicate content within a website can hurt that site’s ability to generate traffic from search engines. Here’s why: if two different URLs serve the same content, the search engines can’t tell which page should actually rank. In more nefarious situations, sites deliberately copy content in an attempt to game the algorithms, which can invoke a negative response from the search engines. The simple summary: keep duplicate content off your site. Unfortunately, some website platforms (including – and perhaps especially – WordPress) make it all too easy to unintentionally generate duplicate content.
Finding Copied Pages
In many cases, you can find pages on a site that have been unintentionally copied through reports in Google Webmaster Tools. In this example we’ll use the Duplicate Title Tags report – and the 65 examples listed – to demonstrate how to find duplicate pages. You can access this report in Google Webmaster Tools under Search Appearance > HTML Improvements.
Click on the results listed in the Duplicate Title Tags report to dig deeper and identify issues with your site that are copying content.
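If you would rather run a similar check yourself, here is a minimal Python sketch of the idea behind a duplicate title report: group URLs by their title tag and keep the titles that appear more than once. This is not how Google builds its report. The function names and the assumption that you already have each page's HTML in a dictionary keyed by URL are mine, for illustration only.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title, lowercased, with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split()).lower()


def duplicate_titles(pages):
    """pages maps URL -> HTML. Returns title -> sorted URLs for titles used more than once."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html)].append(url)
    return {t: sorted(urls) for t, urls in by_title.items() if t and len(urls) > 1}
```

Any title that maps to two or more URLs is worth a closer look, exactly as you would click through the entries in the report.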
In this case, reviewing some of the duplicate title tags reported by Google Webmaster Tools shows a pattern where pages seem to be duplicated by truncating the URL. In the examples below – each of the URL pairs takes users to the exact same content.
We’ve also found duplicate content issues caused by technical problems around pagination, URL parameters and RSS feeds. Each of the different “pages” below returns the exact same content as the homepage – a very bad signal to the search engines. (For the purposes of this article, knowing what pagination, parameters and RSS feeds are isn’t important; knowing that they’ve caused a problem is.)
I won’t go into answering why these problems happened, or how to fix them – but what we’ve diagnosed here are two technical problems that resulted in a pattern of publishing multiple pages with exactly the same content on the site. Not what the search engines want to see.
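To confirm that a set of URLs really does return the same content as the homepage, one rough approach is to fingerprint the visible text of each page and compare the fingerprints. The sketch below assumes you have already downloaded the HTML for each URL. The helper names and the example.com URLs in the usage note are hypothetical, and hashing the text is my choice of technique, not something Webmaster Tools exposes.

```python
import hashlib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def fingerprint(html):
    """SHA-256 of the page text, lowercased, with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def same_content_as(reference_url, pages):
    """Return the sorted URLs in pages whose text matches that of reference_url."""
    target = fingerprint(pages[reference_url])
    return sorted(
        url for url, html in pages.items()
        if url != reference_url and fingerprint(html) == target
    )
```

Calling `same_content_as("https://example.com/", pages)` on a dictionary that also holds a paginated URL, a URL with a parameter and a feed URL would list whichever of them repeat the homepage text word for word.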
Duplicate (kind of) Pages
Finding duplicate H1s can also identify pages that were created carelessly and contain almost exact duplication of content. In the example below – two different pages exist with the H1 “How to stop a runaway Toyota”.
The content on the two pages is nearly identical – the small differences suggest that one was edited after the first was published. Note the two different URLs (my bolds): www.oklahomalawyer.com/how-to-stop-a-runaway-car/ and www.oklahomalawyer.com/how-to-stop-a-runaway-toyota/ delivering essentially the same content:
(Let’s leave the rationale for trying to optimize for “runaway Toyota” as a mystery.)
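Pages that are almost, but not quite, identical won't show up in an exact comparison, so a similarity score is more useful here. Below is a small sketch using Python's standard difflib module to compare the words on each pair of pages. The 0.9 threshold is an arbitrary assumption on my part, and the function expects plain text that you have already pulled out of each page.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(text_a, text_b):
    """Score from 0.0 to 1.0 based on matching runs of words in the two texts."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


def near_duplicates(texts, threshold=0.9):
    """texts maps URL -> plain text. Returns (url_a, url_b, score) for similar pairs."""
    pairs = []
    for (url_a, text_a), (url_b, text_b) in combinations(sorted(texts.items()), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            pairs.append((url_a, url_b, round(score, 2)))
    return pairs
```

Comparing every pair of pages gets slow on a large site, so in practice it makes sense to compare only pages that already share an H1 or a title.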
Unique Content with Duplicate Title Tags
In some cases, attorneys develop genuinely unique content and simply use the exact same language in their on-page titles.
In the example below, the pages at the two URLs (with verbatim identical titles) were written months apart and have entirely different content, yet both share the exact same H1 tag and Title tag. (Assuredly the writer was trying to grab traffic interested in this topic – but having the same exact H1 and Title tag on the same website serves only to confuse the search engines.)
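Since a shared title can mean either a copied page or two genuinely different pages, it helps to sort flagged pairs into those buckets before deciding on a fix. Here is one hedged way to do that in Python. The labels, the function name and the 0.9 cutoff are my own illustrative choices, and the inputs are assumed to be titles and plain text you have already extracted.

```python
from difflib import SequenceMatcher


def classify_pair(title_a, text_a, title_b, text_b, threshold=0.9):
    """Label a pair of pages by whether their titles and body text match."""
    same_title = title_a.strip().lower() == title_b.strip().lower()
    # Word-level similarity of the two bodies, from 0.0 to 1.0.
    score = SequenceMatcher(None, text_a.lower().split(), text_b.lower().split()).ratio()
    if same_title and score >= threshold:
        return "duplicate page"
    if same_title:
        return "unique content, duplicate title"
    if score >= threshold:
        return "duplicate content, different title"
    return "distinct"
```

Pairs labeled "unique content, duplicate title" only need a new title and H1, while "duplicate page" pairs point to one of the deeper technical problems.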
In these examples, there are a variety of reasons why the site has unwittingly published the same content on multiple URLs. The solutions are technical and require both systematic and one-off fixes . . . but it is very easy to use the Google Webmaster Tools report to ID those problems.