All AI Robots.txt Rules
The following are all AI Robot.txt rules from our presentation “How to Protect Your Content from AI… and Should You?” They include all of the user agents that you should consider blocking in one code snippet. We will do our best to maintain this AI Bot code and add and remove rules as needed. We recommend that you read the blog post to get the best perspective on how to protect your content from AI models.
How many organizations are blocking AI Models? By March 2024, 33% of the Top 1000 websites blocked GPTBot.
Robots.txt Code Snippets
We think it’s best to break down whether or not you want your content scraped or if it is a potential answer for search or assistance. To add rules to block AI models from accessing your website in the robots.txt file, copy and paste the following code snippet into your website’s robots.txt file:
AI Scrapers Rules
These are used to scrape and add to AI Models
# Block Open AI
User-agent: GPTBot
Disallow: /
# Block Google (Gemini)
User-agent: Google-Extended
Disallow: /
# Block Claude
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: Claudebot
Disallow: /
# Block CommonCrawl
User-agent: CCBot
Disallow: /
# Block Diffbot
User-agent: Diffbot
Disallow: /
# Block Meta (Facebook)
User-agent: FacebookBot
Disallow: /
# Block ByteDance
User-agent: Bytespider
Disallow: /
# Block Webz.io
User-agent: Omgilibot
Disallow: /
User-agent: Omgili
Disallow: /
# Block ImagesiftBot
User-agent: ImagesiftBot
Disallow: /
# Block Meltwater
User-agent: Meltwater
Disallow: /
# Block Seekr
User-agent: Seekr
Disallow: /
AI Assistants & AI Crawler Rules
These are used to answer questions from AI Assistants or are integrated into the responses for AI Search results
# Block Open AI
User-agent: ChatGPT-User
Disallow: /
# Block Perplexity
User-agent: PerplexityBot
Disallow: /
# Block Amazon
User-agent: Amazonbot
Disallow: /
# Block Apple
User-agent: Applebot
Disallow: /
# Block Cohere
User-agent: cohere-ai
Disallow: /
# Block You.com
User-agent: YouBot
Disallow: /
After adding these lines to your robots.txt file and uploading it to your website’s root directory, the specified AI user agents will be instructed not to crawl or access any pages on your site. Remember to review and update these rules periodically as new AI models and user agents emerge to ensure your content remains protected.
Find out about more AI Agents
What is robots.txt?
Think of the robots.txt file as a website gatekeeper. It provides instructions to search engines and bots about which areas of the site they are allowed to access and index. This file is crucial for managing site visibility and protecting sensitive information from being crawled and displayed in search results, thereby helping to control traffic and the load on the website’s servers.
What are User Agents?
In the robots.txt file, the user agents identify specific web crawlers or bots, allowing site administrators to tailor access permissions individually. By specifying user agents, one can selectively restrict or grant access to different parts of a website, ensuring that only desired bots can index or interact with specific content.
Will AI User Agent rules block Search Engines or Social Sharing?
We have specifically selected AI user agents that are not related to Search or Social sharing. For example: Google-Extended is for Google’s AI Model Bot, where GoogleBot is used for general search. In the future, this may change, but we will update our AI Model code snippet accordingly.
Find out whether AI Bots are looking at your website
(Last Updated: February 25, 2025)
DOWNLOAD THE B2B WORDPRESS SECURITY CHECKLIST
Read Other Technical Snippets
[email protected]
Need help implementing AI Robots.txt Rules?