Controlling Content in Google Search Results: A Technical Guide
Using the Robots Meta Tag, data-nosnippet, and X-Robots-Tag noindex
This document explains how to manage how Google displays your content in search results using page-level and text-level settings.
Page-level settings are defined using either:
- robots meta tag: Placed within the <head> section of individual HTML pages.
- X-Robots-Tag HTTP header: Implemented in the server response for a URL.

Text-level settings use the data-nosnippet attribute within HTML elements to control the display of specific content within a page.
Important: These settings only work if Google's crawlers can access your pages. Blocking crawlers prevents them from discovering these instructions.
Blocking Non-Search Crawlers
While this document focuses on Google Search, you can block other bots like AdsBot-Google using specific rules. For instance, in your robots.txt file:
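A minimal robots.txt sketch matching that description (the directory path comes from the example below):

```
User-agent: AdsBot-Google
Disallow: /private-directory/
```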
This would block AdsBot-Google from accessing the /private-directory/ on your website.
1. Using the Robots Meta Tag
The robots meta tag provides granular, page-specific control over indexing and serving in Google Search.
Implementation:
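A minimal example, placed in the page's <head>:

```html
<meta name="robots" content="noindex">
```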
In this example, the noindex value instructs all search engines to exclude the page from search results.
Key Points:
- Place the robots meta tag within the <head> section of your HTML.
- Both the name and content attributes are case-insensitive.
CMS Users:
Content Management Systems (CMS) like Wix, WordPress, or Blogger often offer built-in settings for managing meta tags. Look for options related to "Search Engine Optimization" or "SEO Settings."
Targeting Specific Crawlers:
Google supports specific user agent tokens within the robots meta tag:
- googlebot: Controls indexing and serving for Google's general web search results.
- googlebot-news: Controls inclusion in Google News.
Examples:
Exclude a page from Google Search:
Exclude a page from Google News:
Exclude a page from both Google Search and Google News:
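The three cases above use standard robots meta tag syntax, each placed in the page's <head>:

```html
<!-- Exclude from Google Search -->
<meta name="googlebot" content="noindex">

<!-- Exclude from Google News -->
<meta name="googlebot-news" content="noindex">

<!-- Exclude from both Google Search and Google News -->
<meta name="robots" content="noindex">
```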
Multiple Crawlers, Different Rules:
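Using the user agent tokens described above, different rules for different crawlers can be written as separate meta tags:

```html
<meta name="googlebot" content="noindex">
<meta name="googlebot-news" content="nosnippet">
```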
This instructs Googlebot not to index the page, while Google News is only prevented from showing snippets.
Important: For blocking non-HTML resources (PDFs, images, videos), use the X-Robots-Tag instead.
2. Using the X-Robots-Tag HTTP Header
The X-Robots-Tag provides similar control to the robots meta tag but is implemented within the HTTP header response for a URL.
Example:
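A sketch of an HTTP response carrying the header:

```http
HTTP/1.1 200 OK
X-Robots-Tag: noindex
```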
This instructs all crawlers to not index the page.
Key Points:
- Any rule applicable in the robots meta tag can be used with the X-Robots-Tag.
- Multiple X-Robots-Tag headers can be used within a single response.
- Rules are not case-sensitive.
Multiple Rules:
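A sketch of a response combining two rules (the expiration date is illustrative):

```http
HTTP/1.1 200 OK
X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2025 15:00:00 PST
```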
This example combines noarchive (prevents caching) and unavailable_after (sets an expiration date).
Targeting Specific Crawlers:
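An X-Robots-Tag header can be prefixed with a user agent token to target a specific crawler; noarchive is used here for "do not cache":

```http
HTTP/1.1 200 OK
X-Robots-Tag: googlebot: noindex
X-Robots-Tag: bingbot: noarchive
```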
Here, googlebot is instructed not to index the page, while bingbot is told not to cache it.
Conflicting Rules:
When conflicting rules are present, the more restrictive rule takes precedence. For example, nosnippet will override max-snippet:50.
3. Indexing and Serving Rules
Both the robots meta tag and X-Robots-Tag support a set of rules to control indexing and snippet generation:
- all: Default behavior; no restrictions.
- noindex: Prevents the page from appearing in search results.
- nofollow: Prevents Google from following links on the page.
- none: Equivalent to combining noindex and nofollow.
- noarchive: Prevents Google from showing a cached link in search results.
- nositelinkssearchbox: Prevents a sitelinks search box from appearing with the site's results.
- nosnippet: Prevents a text snippet or video preview from appearing in search results. This also affects AI-powered summaries.
- indexifembedded: Allows indexing of a page's content, even with noindex, if it is embedded within another page (e.g., using iframes).
- max-snippet:[number]: Limits the text snippet length to the specified number of characters. Using 0 is equivalent to nosnippet, while -1 allows Google to choose the length.
- max-image-preview:[setting]: Controls the maximum size of image previews. Options are none, standard, and large. Note: this doesn't override specific permissions given for content use (e.g., through structured data).
- max-video-preview:[number]: Sets the maximum duration (in seconds) of video previews in search results. As with max-snippet, 0 minimizes the preview and -1 allows Google to decide.
- notranslate: Prevents Google from offering translations of the page title and snippet in search results.
- noimageindex: Prevents images on the page from being indexed.
- unavailable_after:[date/time]: Excludes the page from search results after the specified date and time. Use a standard date/time format (RFC 822, RFC 850, ISO 8601).
Combining Rules:
Multiple rules within a single tag/header:
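Multiple rules can be combined as a comma-separated list in one tag:

```html
<meta name="robots" content="noindex, nofollow">
```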
Multiple meta tags:
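The same rules can also be split across separate tags:

```html
<meta name="robots" content="noindex">
<meta name="robots" content="nofollow">
```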
4. Using the data-nosnippet HTML Attribute
For finer control over snippets, use the data-nosnippet attribute within HTML elements like <span>, <div>, and <section>.
Example:
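A minimal sketch (the text content is illustrative):

```html
<p>
  This text can appear in a search snippet.
  <span data-nosnippet>This part will be excluded from snippets.</span>
</p>
```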
Key Points:
- data-nosnippet is a boolean attribute; its presence alone is enough to take effect.
- Ensure your HTML is valid and all tags are properly closed.
- Avoid dynamically adding or removing data-nosnippet using JavaScript after page load, as it might not be reliably recognized.
5. Using Structured Data
While robots meta tags control general content extraction, structured data using schema.org vocabulary provides specific information to Google for enhanced search features.
Key Points:
- Robots meta tag rules, except for max-snippet, do not affect structured data.
- Use structured data to enhance search results with rich snippets and features.
Example:
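A minimal sketch of recipe structured data using schema.org JSON-LD (the name and URL are illustrative; real recipe markup includes more properties):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "Recipe",
  "name": "Example Pancakes",
  "image": "https://example.com/pancakes.jpg"
}
</script>
```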
Even with nosnippet applied, a page with properly implemented recipe structured data can still appear in recipe carousels.
6. Practical Implementation of X-Robots-Tag
The X-Robots-Tag is implemented within your web server's configuration files.
Apache Examples:
Blocking PDF indexing:
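A common pattern in an Apache configuration or .htaccess file (assumes mod_headers is enabled):

```apache
<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>
```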
Blocking image indexing:
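The same approach for common image formats (also assumes mod_headers):

```apache
<Files ~ "\.(png|jpe?g|gif)$">
  Header set X-Robots-Tag "noindex"
</Files>
```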
NGINX Example:
Blocking PDF indexing:
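The equivalent in an NGINX server block uses add_header inside a location match:

```nginx
location ~* \.pdf$ {
  add_header X-Robots-Tag "noindex, nofollow";
}
```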
7. Interaction with robots.txt
Remember: Robots meta tags and X-Robots-Tag headers are discovered during crawling. If a URL is disallowed in your robots.txt file, these instructions will be ignored. Ensure that pages with these directives are accessible to crawlers.