Benefits of Robots.txt File

A robots.txt file might contain rules like "Disallow: /admin/" (preventing crawlers from accessing the admin area) or "Allow: /" (allowing crawlers to access the entire site). You can protect certain aspects of your site from being crawled. What are your goals in finding out more on this?
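A minimal file with those two rules might look like this (placed at the site root, e.g. https://example.com/robots.txt — the paths are illustrative):

```
User-agent: *
Disallow: /admin/
Allow: /
```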
 
Dumb question: what prevents a human from pulling up the file and finding your admin section? I haven't used robots.txt at all, or if I have, it's been 10-15 years.
 
A robots.txt is a great tool when used correctly.
You can specify sitemaps and influence how search engines like Google and Bing spend their crawl budget, so it isn't wasted on needless items.
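As a sketch, a file that declares a sitemap and hints at a crawl rate might look like this (the sitemap URL is a placeholder; note that Bing honors Crawl-delay, while Google ignores the directive):

```
# Ask Bing's crawler to wait 5 seconds between requests
User-agent: bingbot
Crawl-delay: 5

# Sitemap declarations apply site-wide, outside any user-agent group
Sitemap: https://example.com/sitemap.xml
```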

While many search engines and tools will respect robots.txt and honor requests not to crawl certain areas, it's not a failsafe; it's a request. Search engines like Yandex, along with many site analysis tools and AI crawlers, do not respect robots.txt at all.
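To make the "it's a request" point concrete: a well-behaved client voluntarily checks robots.txt before fetching, for example with Python's standard-library parser. Nothing in the protocol stops a client from simply skipping this check (the rules and URLs below are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt rules, parsed in-memory
rules = """User-agent: *
Disallow: /admin/
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A *polite* crawler checks before fetching -- but the check is voluntary
print(parser.can_fetch("MyBot", "https://example.com/admin/login"))  # False
print(parser.can_fetch("MyBot", "https://example.com/blog/post"))    # True
```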

If you want to protect an area from outside access, you need to password-protect it, restrict it by IP, or use some other access control.

robots.txt does have its benefits when it comes to crawling and SEO; it's just not to be used as a security blanket to block areas, because that doesn't work.
 
By the way, I'll add that in addition to search crawlers, social media platforms' crawlers also read the robots.txt file.
The main job of the robots.txt file is to help search engine crawlers focus on pages that you would like to be visible in search results.
Additionally, in robots.txt you can:
  • Exclude private pages from search
  • Block crawling of resource files (images, scripts, PDFs)
  • Manage crawler traffic to your site (by setting a crawl delay)
  • Declare your sitemap
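A sketch tying those points together in one file (paths and the sitemap URL are placeholders; the wildcard/`$` syntax is an extension supported by major engines, not part of the original standard):

```
User-agent: *
# Keep a private area out of search
Disallow: /private/
# Block crawling of a resource file type
Disallow: /*.pdf$
# Crawl delay in seconds (honored by Bing; Google ignores it)
Crawl-delay: 10

# Declare the sitemap
Sitemap: https://example.com/sitemap.xml
```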
 