Robots.txt noindex won’t find any
support from Google anymore

Google has officially “divorced” the robots.txt noindex directive and will no longer put up with its caprices. The two have until September 1, 2019 to settle the remaining formalities, but the split is final. Google reviewed how robots.txt rules that the internet draft does not support are used, among them crawl-delay, nofollow and noindex, and decided to stop handling them. There is one more equally important reason for the divorce: the directive was never official.

No more robots.txt noindexing | Everyone knew. No one spoke.

Everyone knew that the robots.txt noindex directive was never an official one. Google knew it too, and for a long time it had been preparing us for the bad news. Nevertheless, the two lived in peace until July 2, 2019, when the official announcement came. “Since these rules were never documented by Google, naturally, their usage in relation to Googlebot is very low,” Google said. “These mistakes hurt websites’ presence in Google’s search results in ways we don’t think webmasters intended.” Alongside the announcement, Google released its robots.txt parser as an open source project.


What will replace robots.txt noindex?

By divorcing robots.txt noindex, Google will not leave its “kids” without a shelter. Here are the official alternatives:

  • Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.
  • 404 and 410 HTTP status codes: Both status codes mean that the page does not exist, which will drop such URLs from Google’s index once they’re crawled and processed.
  • Password protection: Unless markup is used to indicate subscription or paywalled content, hiding a page behind a login will generally remove it from Google’s index.
  • Disallow in robots.txt: Search engines can only index pages that they are aware of, so blocking a page from being crawled usually means that its content won’t be indexed. A search engine may still index a URL based on links from other pages without seeing the content itself, but Google says it aims to make such pages less visible in the future.
  • Search Console Remove URL tool: The tool is a quick and easy method to remove a URL temporarily from Google’s search results.
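
The first two options can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any particular server or framework: `build_response`, `REMOVED_PATHS` and `NOINDEX_PATHS` are hypothetical names, and the paths are made up. It shows a noindex signal sent both as an `X-Robots-Tag` HTTP header and as a robots meta tag, plus a 410 status for a page that is gone for good.

```python
from http import HTTPStatus

# Hypothetical site data, for illustration only.
REMOVED_PATHS = {"/old-campaign"}  # page is gone for good -> 410
NOINDEX_PATHS = {"/thank-you"}     # crawlable, but should stay out of the index


def build_response(path):
    """Return (status, headers, body) for a request path."""
    if path in REMOVED_PATHS:
        # 410 tells crawlers the page no longer exists.
        return HTTPStatus.GONE, {}, ""

    headers = {"Content-Type": "text/html; charset=utf-8"}
    meta = ""
    if path in NOINDEX_PATHS:
        # Either signal is enough on its own; both are shown for comparison.
        headers["X-Robots-Tag"] = "noindex"
        meta = '<meta name="robots" content="noindex">'

    body = f"<html><head>{meta}<title>Demo</title></head><body>...</body></html>"
    return HTTPStatus.OK, headers, body
```

Keep in mind that a crawler has to be able to fetch the page to see either noindex signal, so such a page must not be disallowed in robots.txt at the same time.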


How to save your existing robots.txt

It’s simple. It just needs some time and a little effort. As mentioned above, there are five main alternatives. In short, here is what you need to know: if the page is already indexed, use meta noindex. If the page is not indexed yet, you can disallow robots from crawling it.
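
To check that a Disallow rule really covers the page you have in mind, you could try something like the sketch below. It uses Python's standard `urllib.robotparser`, which is not Google's own parser and may behave differently in edge cases; the rules and the example.com URLs are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content, for illustration only.
RULES = """User-agent: *
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# True means the named crawler may fetch the URL under these rules.
blocked_page = parser.can_fetch("Googlebot", "https://example.com/drafts/post")
open_page = parser.can_fetch("Googlebot", "https://example.com/blog/")

print("drafts page crawlable:", blocked_page)
print("blog page crawlable:", open_page)
```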

Google still cares, so it is important to make sure we do not use the noindex directive in the robots.txt file. If we still do, it’s better to fix our robots.txt files before September 1. Otherwise Google will simply ignore the directive, and pages we meant to keep out of the index may show up in search results.
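
A quick way to find leftover rules is to scan the file for them. The sketch below is a minimal example, assuming the robots.txt content is already loaded into a string; `find_unsupported_rules` and the sample file are hypothetical, and the list of rule names comes from the three mentioned in this article.

```python
import re

# The rules named above as unsupported: noindex, nofollow, crawl-delay.
RULE = re.compile(r"^\s*(noindex|nofollow|crawl-delay)\s*:\s*(.*)$", re.IGNORECASE)


def find_unsupported_rules(robots_txt):
    """Return (line_number, rule, value) for every unsupported rule found."""
    findings = []
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        line = line.split("#", 1)[0]  # ignore comments
        match = RULE.match(line)
        if match:
            findings.append((number, match.group(1).lower(), match.group(2).strip()))
    return findings


# Made-up sample file, for illustration only.
SAMPLE = """User-agent: *
Disallow: /admin/
Noindex: /thank-you/
Crawl-delay: 10
"""

for number, rule, value in find_unsupported_rules(SAMPLE):
    print(f"line {number}: {rule} -> {value}")
```

Every line it reports is a candidate for one of the alternatives listed earlier, for example a meta noindex on the page itself.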


If you have more time, read the official announcement here.



