Updating Robots.txt

Updating Your Robots.txt File: A Comprehensive Guide

The robots.txt file acts as a gatekeeper for search engines, telling crawlers which parts of your website they may request. Note that robots.txt controls crawling, not indexing: a blocked URL can still appear in search results if other pages link to it, so use a noindex directive when you need to keep a page out of results entirely. This document provides a detailed walkthrough on how to update your robots.txt file, ensuring optimal search engine visibility for your website.
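For reference, a minimal robots.txt that permits all crawling looks like the sketch below; an empty "Disallow" value blocks nothing, which is also how crawlers behave when no robots.txt file exists at all:

    # Applies to every crawler; an empty Disallow blocks nothing
    User-agent: *
    Disallow: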

Before You Begin:

  • Website Builders: If you use platforms like Wix, Squarespace, or Blogger, you might not be able to directly edit your robots.txt file. These platforms often provide built-in settings to control search engine access. Search their help documentation for instructions (e.g., search for "Wix control search engine indexing").

Updating Your Robots.txt File:

1. Download Your robots.txt File:

Begin by obtaining a copy of your existing robots.txt file. Here are some common methods:

  • Direct Access: Navigate to https://www.yourwebsite.com/robots.txt (replacing "yourwebsite.com" with your actual domain) in your browser. Copy the entire content and paste it into a new text file on your computer. Save the file as "robots.txt".

  • cURL: Use the command-line tool cURL to download the file:

    curl https://www.yourwebsite.com/robots.txt > robots.txt
  • Google Search Console: Log in to your Google Search Console account and open the robots.txt report (under "Settings"). It shows the robots.txt files Google has fetched for your site; copy the contents into a text editor. (The older standalone "robots.txt Tester" tool has been retired.)
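If you're not sure whether your site has a robots.txt file at all, a quick status check tells you (replace the domain with your own; a 200 means the file exists, while a 404 means crawlers treat the entire site as crawlable):

    # Print only the HTTP status code of the robots.txt URL
    curl -s -o /dev/null -w "%{http_code}\n" https://www.yourwebsite.com/robots.txt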

2. Edit Your robots.txt File:

Open the downloaded "robots.txt" file in a plain text editor (like Notepad on Windows, or TextEdit on Mac with plain-text mode enabled, since rich-text formatting would break the file). Here are some common edits you might make:

  • Disallowing Access to a Specific Directory: To prevent search engines from crawling anything within your "images" directory (the rule matches everything under that path, such as /images/photo.jpg), add the following lines:

    User-agent: *
    Disallow: /images/
  • Disallowing Access to a Specific File: To block a specific file, like a PDF document named "confidential.pdf", use:

    User-agent: *
    Disallow: /confidential.pdf 
  • Allowing Access: By default, crawlers may access everything that isn't disallowed, so the "Allow" directive is only useful for carving an exception out of a broader "Disallow" rule. For example, to block the "documents" directory while still permitting its "public" subfolder:

    User-agent: *
    Disallow: /documents/
    Allow: /documents/public/
  • Specifying Crawl Delay: The "Crawl-delay" directive asks a crawler to wait between successive requests. Google ignores this directive (Googlebot's crawl rate is managed automatically and through Search Console), but some other crawlers, such as Bingbot, honor it. For a 10-second delay:

    User-agent: Bingbot
    Crawl-delay: 10
  • Specifying Sitemap Location: Help search engines discover all your pages by specifying your sitemap's location:

    Sitemap: https://www.yourwebsite.com/sitemap.xml

    Important:

    • Each directive goes on its own line, and directives are grouped under the "User-agent" line(s) they apply to.

    • The * in User-agent: * refers to all search engine bots.

    • For specific search engine bots, use their specific names like "Googlebot" or "Bingbot".

    • Ensure your file uses UTF-8 encoding to prevent character interpretation issues.
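Putting these pieces together, here's a sketch of a complete robots.txt file for the hypothetical www.yourwebsite.com, combining the examples from this section:

    # Rules for all crawlers
    User-agent: *
    Disallow: /images/
    Disallow: /confidential.pdf
    Disallow: /documents/
    Allow: /documents/public/

    # A crawler that matches a specific group follows only that group and
    # ignores the * group, so shared rules must be repeated for Bingbot
    User-agent: Bingbot
    Crawl-delay: 10
    Disallow: /images/
    Disallow: /confidential.pdf
    Disallow: /documents/
    Allow: /documents/public/

    # Sitemap references apply to all crawlers
    Sitemap: https://www.yourwebsite.com/sitemap.xml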

3. Upload Your Robots.txt File:

Once you've made the necessary changes and saved your robots.txt file, upload it to the root directory of your website. This is typically done via FTP or through your web hosting control panel.

Important:

  • The file must be named "robots.txt" (lowercase) and placed at the root of your host (e.g., https://www.yourwebsite.com/robots.txt); a robots.txt file in a subdirectory is ignored.

  • If you encounter difficulties, consult your hosting provider's documentation for specific instructions on uploading files.
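After uploading, it's worth confirming that the live file matches your edited copy. A quick sketch using curl and diff (assuming your local copy is saved as "robots.txt" in the current directory; no output means the two are identical):

    # Fetch the live robots.txt and compare it with the local copy
    curl -s https://www.yourwebsite.com/robots.txt | diff - robots.txt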

4. Refresh Google's robots.txt Cache:

  • Google automatically picks up changes to your robots.txt file during regular crawls and generally caches the file for up to 24 hours. To speed up the refresh, open the robots.txt report in Google Search Console and use its "Request a recrawl" option.

By following these steps, you can effectively manage how search engines interact with your website, ensuring that the right content gets indexed and that your website's resources are used efficiently.
