Specifying Canonicals Using rel="canonical" and Other Methods

When you have duplicate or very similar pages on your website, it can be confusing for search engines to determine which page to prioritize in search results. To address this, you can specify a canonical URL, indicating your preferred page to Google Search.

There are several methods to do this, each with varying levels of influence:

Strong Signals:

  1. Redirects: The strongest signal, indicating that the redirect target URL should be the canonical.

  2. rel="canonical" link annotations: A strong signal that explicitly tells search engines the preferred canonical URL.

Weak Signal:

  1. Sitemap Inclusion: A weaker signal, subtly suggesting that URLs within your sitemap are preferred.

These methods can be combined for a stronger impact. For example, using both a redirect and a rel="canonical" link increases the likelihood of your chosen URL appearing in search results.

While specifying a canonical URL is recommended, it's not mandatory. If omitted, Google's algorithms will analyze the content and determine the objectively "best" version for users.

Content Management Systems (CMS):

If your website runs on a CMS like WordPress, Wix, or Blogger, directly editing HTML might be restricted. Look for settings or plugins within your CMS specifically designed to manage canonical URLs. For instance, search for "WordPress set the canonical element."

Why Specify a Canonical URL?

While not always critical, specifying a canonical URL offers several benefits:

1. Control Over Search Result Display:

You can guide users to your preferred version of a page. For example, you might prioritize:

https://www.example.com/products/shoes/hiking-boots.html

over a URL with tracking parameters:

https://example.com/products?category=footwear&product=boots&utm_source=email

2. Signal Consolidation:

Consolidate link signals from various sources to a single, preferred URL. Let's say you have two URLs:

  • https://www.example.com/blog/article1

  • https://example.com/blog/article1?utm_source=socialmedia

By marking the first URL as canonical, any links pointing to the second URL will contribute to the ranking strength of the canonical one.
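
As a rough illustration of how the second URL maps onto the first, the sketch below strips tracking parameters and rewrites the host. The function name canonical_url, the preferred_host parameter, and the rule that every parameter starting with "utm_" is tracking-only are assumptions made for this example, not part of any Google tooling.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: parameters starting with "utm_" carry tracking data only
# and never change the content of the page.
TRACKING_PREFIX = "utm_"


def canonical_url(url: str, preferred_host: str) -> str:
    """Return url on the preferred host with tracking parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIX)
    ]
    # Drop the fragment as well; it is never sent to the server.
    return urlunsplit(
        (parts.scheme, preferred_host, parts.path, urlencode(kept), "")
    )


print(canonical_url(
    "https://example.com/blog/article1?utm_source=socialmedia",
    preferred_host="www.example.com",
))
# prints https://www.example.com/blog/article1
```

The value this returns is what you would place in the rel="canonical" annotation of the parameterized page.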

3. Simplified Tracking:

Having a single canonical URL for a piece of content makes it easier to track its performance metrics accurately.

4. Efficient Crawling:

Prevent search engine crawlers from spending time on duplicate pages. This allows them to focus on discovering new or updated pages on your site.

Best Practices:

  • Avoid using robots.txt for canonicalization: This file is for controlling crawler access, not canonicalization.

  • Don't use the URL removal tool: This will hide all versions of the URL from search, not just the duplicates.

  • Maintain consistency: Use the same canonical URL across different methods (sitemap, rel="canonical", etc.).

  • Don't rely on "noindex": While it prevents indexing, it's not a replacement for proper canonicalization using rel="canonical".

  • Align with hreflang: When using hreflang for language variations, ensure your canonical URL is in the same language or the closest alternative.

  • Consistent internal linking: Link to your chosen canonical URL throughout your website to reinforce your preference to search engines.

Canonicalization Methods Compared:

Google fully supports explicit rel="canonical" link annotations. However, rel="canonical" annotations that suggest alternate versions of a page (those carrying hreflang, lang, media, or type attributes) are not used for canonicalization. To specify alternate versions, use the appropriate annotation instead, such as rel="alternate" with hreflang for language variations.

Two Implementation Methods:

  1. The rel="canonical" link element in HTML

  2. The rel="canonical" link HTTP header

While both are supported, using only one method is recommended to avoid potential errors.

The rel="canonical" Link Element:

This element, placed within the <head> section of your HTML, indicates the canonical URL for that page.

Example:

Let's say https://www.example.com/products/red-shoes is your preferred canonical URL, but the content is also accessible via other URLs.

  1. Add the link element to duplicate pages:

<head>
  <link rel="canonical" href="https://www.example.com/products/red-shoes" />
</head>

  2. For mobile variants: If the canonical page has a separate mobile version, add a rel="alternate" link element on the canonical page:

<head>
  <link rel="canonical" href="https://www.example.com/products/red-shoes" />
  <link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.example.com/products/red-shoes" /> 
</head>

  3. Include other relevant elements: Add hreflang or other necessary annotations as needed.

Key Points:

  • Absolute URLs: Always use complete URLs (e.g., https://www.example.com/page) instead of relative paths (e.g., /page).

  • Valid HTML: Ensure the <link> element is placed within a valid <head> section.

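  • JavaScript Injection: If you add the link element with JavaScript, make sure it is inserted into the <head> and that the page ends up with only one rel="canonical" link element.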
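
The key points above can be checked automatically. The following is a minimal sketch using only the Python standard library; the names CanonicalFinder and check_canonical are hypothetical, and the rule that a page should carry exactly one canonical link is an assumption of this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collect href values of <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            rel_tokens = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_tokens:
                self.canonicals.append(attr.get("href") or "")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_canonical(html: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    finder = CanonicalFinder()
    finder.feed(html)
    problems = []
    # Assumption for this sketch: exactly one canonical link per page.
    if len(finder.canonicals) != 1:
        problems.append(
            f"expected 1 canonical link in <head>, found {len(finder.canonicals)}"
        )
    for href in finder.canonicals:
        parts = urlsplit(href)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {href!r}")
    return problems
```

Note that this inspects the HTML as served. A link element injected by JavaScript only shows up if you feed the function the rendered markup.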

The rel="canonical" HTTP Header:

If you have server-side control, you can use the Link HTTP header with a rel="canonical" attribute to define the canonical URL. This method is particularly useful for non-HTML files like PDFs.

Example:

To specify that the PDF version of a document is the canonical, you would add the following header to the corresponding .docx file's response:

Link: <https://www.example.com/document.pdf>; rel="canonical"

Important Notes:

  • Similar to the link element, use absolute URLs within the HTTP header.

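  • Enclose the URL in angle brackets, as in the example, and use only double quotes (not single quotes) for quoted values such as rel="canonical", as per RFC2616.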
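
How you attach the header depends on your server or framework. As a framework-neutral sketch, the hypothetical helper below only builds the header name and value in the same shape as the example above and rejects relative URLs.

```python
from urllib.parse import urlsplit


def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Build a (name, value) pair for a rel="canonical" Link header."""
    parts = urlsplit(canonical_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"canonical URL must be absolute: {canonical_url!r}")
    # URL in angle brackets, rel value in double quotes.
    return ("Link", f'<{canonical_url}>; rel="canonical"')


name, value = canonical_link_header("https://www.example.com/document.pdf")
print(f"{name}: {value}")
# prints Link: <https://www.example.com/document.pdf>; rel="canonical"
```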

Utilizing a Sitemap:

In your sitemap, list the canonical URL for each page. This provides Google with a consolidated list of preferred URLs, although Google still determines duplicates based on content analysis. Sitemaps are a simple way to manage canonicals for large websites and highlight important pages.
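
For illustration, a sitemap that lists only canonical URLs could be generated along these lines. This is a minimal sketch: build_sitemap is a hypothetical helper, and it emits only the required loc element for each URL.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls: list[str]) -> str:
    """Return sitemap XML with one <url><loc> entry per canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    "https://www.example.com/products/red-shoes",
    "https://www.example.com/blog/article1",
]))
```

Leave duplicate URLs, such as those with tracking parameters, out of the list you pass in.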

Leveraging Redirects:

Use redirects to permanently remove duplicate pages and direct traffic to your chosen canonical URL. 301 (permanent) redirects are the most effective for this purpose.

Example:

If your page has multiple accessible URLs:

  • https://example.com/home

  • https://home.example.com

  • https://www.example.com

Choose one as your canonical and implement 301 redirects from the other URLs to the preferred one.
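
Redirects are normally configured in the web server or CMS. Purely to show the logic, here is a minimal WSGI sketch that assumes https://www.example.com/ was chosen as the canonical; the names CANONICAL, DUPLICATES, and app are hypothetical.

```python
# Hypothetical choice for this sketch: https://www.example.com/ is canonical.
CANONICAL = "https://www.example.com/"

# (host, path) pairs that serve the same page and should redirect.
DUPLICATES = {
    ("example.com", "/home"),
    ("home.example.com", "/"),
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for duplicate URLs, 200 for everything else."""
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO") or "/"
    if (host, path) in DUPLICATES:
        # 301 marks the move as permanent.
        start_response("301 Moved Permanently", [("Location", CANONICAL)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>Home</h1>"]
```

To try it locally, you could serve app with wsgiref.simple_server.make_server from the standard library.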

Other Signals:

Beyond explicit methods, Google also considers:

1. HTTPS Preference:

Google prioritizes HTTPS pages as canonical over HTTP equivalents unless issues exist, such as:

  • Invalid SSL certificate

  • Insecure dependencies

  • Redirects to HTTP

  • rel="canonical" pointing to HTTP

To ensure HTTPS preference:

  • Implement redirects from HTTP to HTTPS.

  • Add a rel="canonical" link from the HTTP page to the HTTPS page.

  • Implement HSTS (HTTP Strict Transport Security).

To avoid incorrect HTTP canonicalization:

  • Use valid SSL certificates and avoid HTTPS to HTTP redirects.

  • Include HTTPS versions in your sitemap and hreflang annotations.

  • Ensure your SSL certificate matches your domain and subdomains correctly.
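
The redirect and HSTS steps can be sketched as a single decision: redirect HTTP requests to HTTPS, and send the Strict-Transport-Security header on HTTPS responses. The helper name https_response_headers and the one-year max-age are assumptions for this example.

```python
from urllib.parse import urlsplit

# Assumption: a one-year max-age; pick a value that suits your rollout.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000")


def https_response_headers(url: str) -> tuple[int, list[tuple[str, str]]]:
    """Return (status, headers) that steer a request toward HTTPS."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # Same host, path, and query, but on HTTPS.
        target = parts._replace(scheme="https").geturl()
        return 301, [("Location", target)]
    # Browsers only honor HSTS when it arrives over HTTPS.
    return 200, [HSTS_HEADER]


print(https_response_headers("http://www.example.com/page?x=1"))
# prints (301, [('Location', 'https://www.example.com/page?x=1')])
```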

2. hreflang Clusters:

Google favors URLs within hreflang clusters for localized content.

Example:

If https://example.com/de-de/products and https://example.com/de-ch/products link to each other with hreflang annotations, but neither is reciprocally linked with https://example.com/de-at/products, Google prefers the de-de and de-ch pages over the de-at page, which is outside the cluster.
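
To find pages that fall outside a cluster, you can check which hreflang links are reciprocated. The sketch below assumes you have already collected, for each page, the set of URLs its hreflang annotations point to; the function name clustered_pages and the one-way link from the de-at page are made up for the example.

```python
def clustered_pages(hreflang_links: dict[str, set[str]]) -> set[str]:
    """Return URLs that have at least one reciprocal hreflang link."""
    clustered = set()
    for page, targets in hreflang_links.items():
        for target in targets:
            # A link counts only if the target links back to this page.
            if target != page and page in hreflang_links.get(target, set()):
                clustered.update((page, target))
    return clustered


links = {
    "https://example.com/de-de/products": {"https://example.com/de-ch/products"},
    "https://example.com/de-ch/products": {"https://example.com/de-de/products"},
    # Hypothetical one-way link: de-at points out, nothing points back.
    "https://example.com/de-at/products": {"https://example.com/de-de/products"},
}
print(sorted(clustered_pages(links)))
# prints ['https://example.com/de-ch/products', 'https://example.com/de-de/products']
```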

By understanding and implementing these canonicalization methods and best practices, you can improve your website's SEO performance by providing a clear and consistent signal to search engines about your preferred content.
