We’ve said it way back when, but we’ll repeat it: it keeps amazing us that there are still people using only a robots.txt file to prevent indexing of their site in Google or Bing. As a result, their site shows up in the search engines anyway. Do you know why it keeps amazing us? Because robots.txt doesn’t actually keep your site out of the search results, even though it does prevent indexing of your site. Let me explain how this works in this post.
For more on robots.txt, please read robots.txt: the ultimate guide. Or, find the best practices for handling robots.txt in WordPress.
There’s a difference between being indexed and being listed in Google
Before we explain things any further, we need to go over some terms first:
- Indexed / Indexing
The process of downloading a site’s or a page’s content to the server of the search engine, thereby adding it to its “index.”
- Ranking / Listing / Showing
Showing a site in the search result pages (aka SERPs).
So, while the most common process goes from indexing to listing, a site doesn’t have to be indexed to be listed. If a link points to a page, domain, or wherever, Google follows that link. If the robots.txt on that domain prevents indexing of that page by a search engine, the search engine will still show the URL in the results if it can gather from other signals that the page might be worth looking at.
In the old days, those signals could have come from DMOZ or the Yahoo directory, but I can imagine Google using, for instance, your My Business details these days, or the old data from those projects. More sites summarize your website, right?
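To make the difference concrete, here is a minimal Python sketch of what a well-behaved crawler does with a robots.txt that blocks everything. It uses the standard library’s urllib.robotparser; the ExampleBot user agent and the example.com URL are placeholders made up for illustration, not details of any real crawler. The point is that a blocked crawler never downloads the page, so it never gets to read anything on it, while links from other sites can still tell the search engine that the URL exists.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt lets this user agent download the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ExampleBot and example.com are placeholders for illustration only.
url = "https://example.com/private-page/"
if may_crawl(ROBOTS_TXT, "ExampleBot", url):
    print("Crawler may download the page and read its meta robots tag.")
else:
    print("Crawler is blocked: it cannot see any noindex request on the page.")
```

Note that this only models the crawling decision. Whether a blocked URL still gets listed depends on the search engine’s other signals, which this sketch does not try to imitate.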
Now if the explanation above doesn’t make sense, have a look at this video explanation by ex-Googler Matt Cutts from 2009:
If you have reasons to prevent your website’s indexing, adding that request to the specific page you want to block, like Matt is talking about, is still the right way to go.
But you’ll need to inform Google about that meta robots tag. So, if you want to hide pages from the search engines effectively, you need them to index those pages, even though that might seem contradictory. There are two ways of doing that.
Prevent listing of your page by adding a meta robots tag
The first option to prevent the listing of your page is by using robots meta tags. We’ve got an ultimate guide on robots meta tags which is more extensive, but it basically comes down to adding this tag to your page:
<meta name="robots" content="noindex,nofollow">
If you use Yoast SEO, this is super easy! No need to add the code yourself. Learn how to add a noindex tag with Yoast SEO here.
The issue with a tag like that, though, is that you have to add it to each and every page.
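Because the tag has to be present on every page, it can help to check pages for it automatically. The following is a rough sketch, assuming you already have a page’s HTML as a string; the RobotsMetaParser class and the has_noindex helper are names invented for this example, built on Python’s standard html.parser module.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for directive in attributes.get("content", "").split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a meta robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print(has_noindex(page))
```

This sketch only looks at the generic robots name. Tags aimed at one specific crawler are deliberately left out to keep it short.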
To make the process of adding the meta robots tag to every single page of your site a bit easier, the search engines came up with the X-Robots-Tag HTTP header. This allows you to specify an HTTP header called X-Robots-Tag and set the value as you would the meta robots tag’s value. The cool thing about this is that you can do it for an entire site. If your site is running on Apache, and mod_headers is enabled (it usually is), you could add the following single line to your .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
The effect is that the entire site can be indexed but would never be shown in the search results.
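Once the header is in place, you may want to confirm that responses actually carry it. Below is a small sketch, assuming you have already collected a response’s headers into a plain mapping of names to values; header_blocks_listing is a hypothetical helper name, and the sketch does not handle header values that are prefixed with a specific crawler’s name.

```python
from typing import Mapping


def header_blocks_listing(headers: Mapping[str, str]) -> bool:
    """Return True if an X-Robots-Tag header carries a noindex directive."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        directives = {part.strip().lower() for part in value.split(",")}
        if "noindex" in directives:
            return True
    return False


# Example headers as they might look with the .htaccess line in place.
response_headers = {
    "Content-Type": "text/html; charset=UTF-8",
    "X-Robots-Tag": "noindex, nofollow",
}
print(header_blocks_listing(response_headers))
```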
So, get rid of that robots.txt file with Disallow: / in it. Use the X-Robots-Tag or that meta robots tag instead!
Read more: The ultimate guide to the meta robots tag »