# How to limit bots with robots.txt
Many search engine crawlers and other bots, such as Googlebot, respect a robots.txt file placed in the root directory of your website. The robots.txt file tells crawlers and other bots which pages or files they may crawl and which they should ignore.
Here are several common examples of how to use robots.txt:
## Block a directory

```
User-agent: *
Disallow: /private/
```

## Block a specific bot

```
User-agent: BadBot
Disallow: /
```

## Allow all bots to access everything

```
User-agent: *
Allow: /
```

## Block one bot, allow others

```
User-agent: BadBot
Disallow: /

User-agent: *
Allow: /
```
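If you want to sanity-check rules like these before deploying them, Python's standard-library `urllib.robotparser` can parse a robots.txt snippet and report whether a given user agent may fetch a URL. A minimal sketch, using the "block one bot, allow others" rules above (the bot names and URL are just placeholders):

```python
import urllib.robotparser

# The "block one bot, allow others" rules from above
rules = """\
User-agent: BadBot
Disallow: /

User-agent: *
Allow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# BadBot is blocked everywhere; every other bot may fetch anything
print(parser.can_fetch("BadBot", "https://example.com/page"))    # False
print(parser.can_fetch("Googlebot", "https://example.com/page")) # True
```

This checks only how compliant parsers interpret your rules; it does not stop a bot that ignores robots.txt.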
## Crawl-Delay

You can also use Crawl-Delay to ask a bot to wait a given number of seconds between requests. Note that not all crawlers honor this directive; Googlebot, for example, ignores it.

### Add a delay to all bots

```
User-agent: *
Crawl-Delay: 5
```

### Add a delay to a specific bot

```
User-agent: BadBot
Crawl-Delay: 10
```
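The same standard-library parser can report the delay that applies to a given bot, which is a quick way to confirm that a bot-specific delay overrides the default. A minimal sketch, using the two Crawl-Delay examples above (the bot names are placeholders):

```python
import urllib.robotparser

# The Crawl-Delay rules from the examples above
rules = """\
User-agent: *
Crawl-Delay: 5

User-agent: BadBot
Crawl-Delay: 10
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# BadBot gets its specific delay; other bots fall back to the default
print(parser.crawl_delay("BadBot"))     # 10
print(parser.crawl_delay("Googlebot"))  # 5
```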
## Other options

You may find that not all bots respect robots.txt, and you may need to take different measures: