How to Configure Magento 2 Robots.txt?

The Magento 2 robots.txt file is one of the most important Magento SEO features. It is a set of instructions for web crawlers that guides them on which pages of your website to crawl and index.

It mainly consists of Allow and Disallow directives that block certain pages from being crawled by search engines. Confused already?

In this guide, you'll learn what the Magento 2 robots.txt file is and where to find and configure it. We'll also share robots.txt best practices. So stay tuned.

What is the Magento 2 Robots.txt File?

Magento robots.txt is a text file that instructs web crawlers on how to crawl the pages of your website. It establishes a kind of relationship between your website and crawlers. When you configure the Magento robots.txt file, you tell web robots which pages of your website to index and which to skip.
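
To make this more concrete, here is a minimal sketch that uses Python's standard urllib.robotparser module to read a few rules and ask whether a URL may be crawled. The domain example.com, the bot name ExampleBot and the two rules are assumptions made up for illustration; they are not Magento defaults.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not Magento's default robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /customer/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A product page is not covered by any Disallow rule, the cart page is.
can_crawl_product = parser.can_fetch("ExampleBot", "https://example.com/blue-shoes.html")
can_crawl_checkout = parser.can_fetch("ExampleBot", "https://example.com/checkout/cart/")

print(can_crawl_product, can_crawl_checkout)  # prints: True False
```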

Robots.txt Best Practices

Before we move to the configuration, make sure your Magento robots.txt is optimized for SEO. Use these best practices:

  • Restrict access to sensitive content: don't allow search crawlers to access confidential directories or CMS directories.
  • Don't allow indexing of irrelevant pages: disallow pages that add no search value so crawlers spend the crawl budget on the pages that matter.
  • Include the XML sitemap: add the sitemap to robots.txt so robots can find all relevant pages of your store.
  • Troubleshoot the issues: check Google Search Console for robots.txt errors to ensure uninterrupted indexing.

How to Configure Magento 2 Robots.txt?

Now that you know what is best to allow and disallow in your robots.txt, it's time to configure the file.

1. Navigate to Content > Design > Configuration.

2. Choose the Website you want to configure the Magento robots.txt file for.

Note: the robots.txt option is not available at the store view level, only at the website or global level.

Magento 2 design configuration store

3. Find the Search Engine Robots section.

4. Choose the Default Robots value you want to apply to your website.

There are several options you can choose from. Let's see what each robots meta tag option stands for:

  • INDEX, FOLLOW: web crawlers index a page and follow the links on that page.
  • NOINDEX, FOLLOW: web crawlers don't index a page but follow the links on that page.
  • INDEX, NOFOLLOW: web crawlers index a page but don't follow the links on that page.
  • NOINDEX, NOFOLLOW: web crawlers neither index a page nor follow the links on that page.

5. Enter your custom instructions for search crawlers in the Edit custom instruction of robots.txt File field. See the examples of custom robots.txt instructions later in this article.

6. Click the Reset To Defaults button if you want to delete all your custom instructions and go back to the default ones.

Magento 2 robots.txt

Once you Save the configuration, you also need to flush the cache.

Pro tip: if your website is under development, you might want to keep web robots from indexing it. To do that, choose NOINDEX, NOFOLLOW in the Default Robots field.
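
After saving and flushing the cache, you may want to confirm which Default Robots value a page actually serves. The sketch below assumes the setting is rendered as a standard robots meta tag in the page head and extracts it from an HTML string using Python's built-in html.parser. The sample HTML is a made-up fragment, not real Magento output.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of the <meta name="robots"> tag, if present."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Normalize "noindex, nofollow" and "NOINDEX,NOFOLLOW" to the same form.
            self.directives = [
                part.strip().upper() for part in content.split(",") if part.strip()
            ]


def robots_directives(html):
    """Return the robots meta directives found in the given HTML, or []."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page fragment for a store that is still under development.
SAMPLE_HTML = (
    '<html><head><meta name="robots" content="NOINDEX,NOFOLLOW"/></head>'
    "<body></body></html>"
)

print(robots_directives(SAMPLE_HTML))  # prints: ['NOINDEX', 'NOFOLLOW']
```

In practice you would pass the HTML of a real storefront page to robots_directives instead of the sample string.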

Example of Magento 2 Robots.txt Instructions

You usually need to "hide" some custom, sensitive and irrelevant content from search crawlers. For that, you need custom instructions in the Magento robots.txt file. Here is an example:

# Define user agents/bots
User-agent: *
User-agent: Googlebot
User-agent: Googlebot-image

# Urls with parameters
Disallow: /*?
Allow: /*?page=
Allow: /*?p=
Allow: /graphql?
Disallow: /*SID=

# Technical paths
Disallow: /repo/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/seo_sitemap/
Disallow: /catalogsearch/
Disallow: /mfproductsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /customer/
Disallow: /customize/
Disallow: /sendfriend/
Disallow: /ajaxcart/
Disallow: /ajax/
Disallow: /quickview/
Disallow: /productalert/
Disallow: /mfcmsdr/
Disallow: /sales/guest/form/
Disallow: /review/
Disallow: /downloadable/
Disallow: /pslogin
Disallow: /subscription
Disallow: /newsletter
Disallow: /push_notification

# Files
Disallow: /index.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /get.php
Disallow: /app/
Disallow: /lib/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /var/

# CMS Pages
Disallow: /privacy-policy-cookie-restriction-mode
Disallow: /no-route
Disallow: /enable-cookies
Disallow: /home
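
Rules that combine wildcards, such as "Disallow: /*?" together with "Allow: /*?p=", are easy to get wrong, so it helps to test them before saving. The following is a simplified sketch of the matching logic major crawlers such as Googlebot document: "*" matches any sequence of characters, a trailing "$" anchors the end of the URL, the longest matching rule wins, and Allow wins a tie. It ignores user-agent groups and percent-encoding, and the helper names are our own. Python's built-in urllib.robotparser matches plain path prefixes, which is why a small hand-written matcher is used here.

```python
import re

# A few rules taken from the example above, as (directive, pattern) pairs.
RULES = [
    ("disallow", "/*?"),
    ("allow", "/*?page="),
    ("allow", "/*?p="),
    ("disallow", "/checkout/"),
    ("disallow", "/*.php$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if the path (including query string) may be crawled."""
    best = None  # (pattern length, is_allow) of the most specific match so far
    for directive, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            # Longer pattern wins; on equal length True (allow) sorts above False.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


for path in ["/shoes.html", "/shoes.html?color=red", "/shoes.html?p=2",
             "/checkout/cart/", "/index.php"]:
    print(path, is_allowed(path, RULES))
```

With these rules, only /shoes.html and /shoes.html?p=2 come back as allowed: the filtered URL, the cart page and the .php file are all blocked.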

How to Add Sitemap to Robots.txt in Magento 2?

Similar to robots.txt, the Magento sitemap plays an important role in your SEO. It helps search engines discover and analyze your website links. And since robots.txt is a set of instructions for what to crawl, you should also add the sitemap to this file.

To add a sitemap to Magento robots.txt:

1. Navigate to Stores > Configuration > Catalog > XML Sitemap and find the Search Engine Submission Settings section.

2. Set the Enable Submission to Robots.txt option to Yes.

Magento 2 XML sitemap configuration

If you want to add a custom XML sitemap to robots.txt, navigate to Content > Design > Configuration > choose a website > Search Engine Robots. Then add the custom sitemap at the end of the "Edit custom instruction of robots.txt File" field, as in this example:

Sitemap: https://magefan.com/pub/sitemaps/blog_sitemap.xml
Sitemap: https://magefan.com/pub/sitemaps/blog_sitemap_ua.xml
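
To check that the Sitemap lines were picked up, you can list the sitemaps a robots.txt file declares. This sketch relies on the site_maps() method of urllib.robotparser, available in Python 3.8 and later. The robots.txt text is a shortened, assumed sample built around the two Sitemap lines above.

```python
from urllib.robotparser import RobotFileParser

# Shortened sample; a real file would contain the full set of rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/

Sitemap: https://magefan.com/pub/sitemaps/blog_sitemap.xml
Sitemap: https://magefan.com/pub/sitemaps/blog_sitemap_ua.xml
"""


def listed_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the robots.txt text (may be empty)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


for sitemap_url in listed_sitemaps(ROBOTS_TXT):
    print(sitemap_url)
```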

Important: your sitemap shouldn't include pages that you disallow in robots.txt. To keep them out, check our guide on how to exclude pages from XML sitemap.
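
One way to spot such conflicts is to run every sitemap URL through the robots.txt rules. The sketch below does that with Python's standard library. The sitemap content, the example.com URLs and the bot name are invented for illustration. Note that urllib.robotparser only does prefix matching, so wildcard rules like "/*?" are not evaluated the way Google would evaluate them.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical inputs for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /customer/
"""

SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blue-shoes.html</loc></url>
  <url><loc>https://example.com/customer/account/login/</loc></url>
</urlset>
"""


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="ExampleBot"):
    """Return sitemap URLs that the robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


print(blocked_sitemap_urls(SITEMAP_XML, ROBOTS_TXT))
# prints: ['https://example.com/customer/account/login/']
```

Any URL this prints is a candidate for removal from the sitemap.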