Does blocking GPTBot keep my business out of ChatGPT answers?

Not directly. GPTBot collects training data. OAI-SearchBot and ChatGPT-User are the bots behind ChatGPT's live search answers. You can block GPTBot and still appear in ChatGPT search, as long as the search bots are allowed. The common mistake is pasting one snippet that blocks all three.

Is robots.txt enough to control AI crawlers?

No. robots.txt is a request, not a wall. Compliant bots respect it; others ignore it. And your CDN can override it in the other direction: since July 2025 Cloudflare has blocked AI crawlers by default for new customers, even when robots.txt allows them. Check both layers.

Will allowing AI crawlers hurt my Google rankings?

No. Googlebot, which indexes your site for search, is a separate user agent and is not affected by rules you set for GPTBot or PerplexityBot. If you want to stay in Google search but out of Gemini training, block Google-Extended. Your search indexing continues either way.

Is your website blocking AI crawlers? A five-minute check

Cloudflare now blocks AI crawlers by default. How to check your robots.txt and CDN settings so ChatGPT, Claude, and Perplexity can read your site.

Is your website blocking AI crawlers without you knowing?

Your website may be blocking AI crawlers right now without you knowing. Since July 1, 2025, Cloudflare, which handles roughly 20% of all web traffic, has blocked AI crawlers by default for new customers. If nobody on your team changed that setting, ChatGPT, Claude, and Perplexity may be locked out of your site.

Here’s the thing. Most GEO advice starts with content: write FAQ pages, add schema, earn citations. All useful. But none of it counts if the AI crawler never gets through the door. The door check comes first, and almost nobody runs it.

It is not just the CDN. A lot of robots.txt files got a “block all AI bots” snippet in 2023 or 2024, back when the only question was training data. A Reuters Institute study found that by the end of 2023, 48% of the most widely used news sites across ten countries were blocking OpenAI’s crawlers, and many sites that added a block never distinguished the training bot from the search bots that send buyers back.

How do I check if my website is blocking AI crawlers?

To check if your website is blocking AI crawlers, open yourdomain.com/robots.txt and look for GPTBot, ClaudeBot, OAI-SearchBot, or PerplexityBot under a Disallow rule. Then log into your CDN, Cloudflare for most small businesses, and check the AI bot blocking setting. Five minutes, no developer needed.

Step by step:

1. Type your domain into the browser and add /robots.txt to the end.
2. Scan for the bot names above. A Disallow: / line under any of them means that bot is told to stay out of your whole site.
3. Watch for the wildcard. A User-agent: * followed by Disallow: / blocks everything, AI bots included.
4. Log into Cloudflare (or your CDN), open the bot settings, and look for the AI crawler block. On new accounts it is on by default.
5. If you have server logs, search them for “GPTBot” or “PerplexityBot”. Rows of 403 responses mean your server is turning the bots away regardless of what robots.txt says.
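The robots.txt part of the steps above can be sketched in a few lines of Python using the standard library's urllib.robotparser. This is a minimal sketch, not a definitive audit: the SAMPLE file is made up for illustration, the helper name blocked_bots is hypothetical, and Python's parser is only one interpretation of the rules, so a real crawler may resolve an edge case differently.

```python
from urllib.robotparser import RobotFileParser

# Bot names taken from the article; extend the list as needed.
AI_BOTS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "PerplexityBot"]


def blocked_bots(robots_txt: str, bots=AI_BOTS, path: str = "/") -> list[str]:
    """Return the bots that this robots.txt tells to stay off `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, path)]


# Hypothetical file: a wildcard block plus one explicit allow.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

print(blocked_bots(SAMPLE))  # ['GPTBot', 'ClaudeBot', 'PerplexityBot']
```

To run it against a live site, paste the contents of yourdomain.com/robots.txt into a string and pass that to blocked_bots. An empty list means none of the listed bots is disallowed at the site root, which says nothing yet about the CDN layer.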

The CDN step is the one everybody skips. Your robots.txt can say “come in” while the firewall in front of it says no, and from the outside a firewall block looks exactly like a robots.txt block: the AI simply never mentions you.
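If you do have server logs, a short script can show whether bots are being turned away at the server even though robots.txt lets them in. The sketch below counts HTTP status codes per bot. It assumes the common "combined" access log layout, where the status code follows the quoted request line, and it matches bot names as plain substrings. The log lines, IP addresses, and user-agent strings in SAMPLE_LOG are invented for illustration. Requests that a CDN blocks at its edge may never reach your own logs, so an empty result does not prove the bots are getting through.

```python
import re
from collections import Counter

# Bot names from the article; matched case-insensitively as substrings.
BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]

# In combined log format the status code follows the quoted request line:
#   "GET /path HTTP/1.1" 403 512 ...
STATUS_RE = re.compile(r'" (\d{3}) ')


def status_counts(log_lines, bots=BOTS):
    """Count HTTP status codes per bot across the given log lines."""
    counts = {bot: Counter() for bot in bots}
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match is None:
            continue  # line is not in the assumed format
        lowered = line.lower()
        for bot in bots:
            if bot.lower() in lowered:
                counts[bot][match.group(1)] += 1
    return counts


# Invented example lines, not real traffic.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jul/2025:10:00:00 +0000] "GET / HTTP/1.1" 403 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [01/Jul/2025:10:00:05 +0000] "GET /pricing HTTP/1.1" 403 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [01/Jul/2025:10:01:00 +0000] "GET / HTTP/1.1" 200 8841 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [01/Jul/2025:10:02:00 +0000] "GET / HTTP/1.1" 200 8841 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for bot, codes in status_counts(SAMPLE_LOG).items():
    if codes:
        print(bot, dict(codes))
```

For a real log, pass an open file instead of the sample list, for example `status_counts(open("access.log"))`, where the file name is a placeholder for wherever your host stores its logs. A bot whose counts are mostly 403 is being refused by the server.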

Which AI crawlers should I allow?

Allow the search crawlers that put your name in AI answers: OAI-SearchBot and ChatGPT-User for ChatGPT, Claude-SearchBot for Claude, PerplexityBot for Perplexity. Training bots like GPTBot and CCBot are a separate choice. Cloudflare Radar data published in June 2026 shows GPTBot is the most blocked AI crawler, and many sites block every OpenAI bot in one go.

The distinction that matters: some bots collect training data and send you nothing back. Others power live search answers, and those answers are where your buyers are deciding. Block the first group if you like. Blocking the second group makes you invisible.

Crawler	Operator	What it does	Allow it?
OAI-SearchBot	OpenAI	Powers ChatGPT search answers	Yes
ChatGPT-User	OpenAI	Fetches pages live when a user asks	Yes
Claude-SearchBot	Anthropic	Powers Claude web search (new in 2026)	Yes
PerplexityBot	Perplexity	Powers Perplexity answers	Yes
GPTBot	OpenAI	Collects model training data	Your call
ClaudeBot	Anthropic	Collects model training data	Your call
CCBot	Common Crawl	Open training dataset	Your call

In a single-day sample of 4,047 robots.txt files parsed by Cloudflare on March 30, 2026, 13.8% mentioned GPTBot in their rules, more than any other AI crawler. The internet is genuinely split on training bots. Fine. Just don’t let that argument cost you the search bots.

Why check crawler access before any other GEO work?

Check crawler access before any other GEO work because a blocked crawler makes the rest worthless. Schema, FAQ pages, citations: none of it counts if the bot never reads the page. And the traffic at stake is good traffic. AI search visitors are worth 4.4x as much as organic search visitors, per Semrush research from June 2025.

One honest caveat on that number: Semrush measured it across 500+ digital marketing topics, a field where AI adoption runs hottest. Your category may sit lower. The direction still holds, because an AI-referred visitor has already compared options before they click.

So the order of operations is: door first, content second. Once the bots can read you, the actual GEO work begins. We covered that side in the GEO primer.

How do I fix a robots.txt that blocks AI crawlers?

To fix a robots.txt that blocks AI crawlers, remove or change the Disallow lines for the search bots you want in, then re-test the file. In Cloudflare, set the AI crawler control to allow verified search bots. Re-check after every site migration or plugin update, because these settings can reset quietly.

A clean setup that keeps your training-data choice separate from your visibility looks like this:

# Search bots — these put you in AI answers
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Training bot — block it or not, your choice
User-agent: GPTBot
Disallow: /

Two warnings from sites we’ve looked at. First, rule order and wildcards trip people up: not every crawler resolves a specific user-agent rule against a User-agent: * block the same way, so test the file after editing. Second, robots.txt is voluntary. It controls only the polite bots. The CDN setting is what actually enforces access, in both directions.
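One way to test the file after editing is to write down the policy you intend and compare it with what a parser actually reads. The sketch below does that with Python's urllib.robotparser. EXPECTED and policy_mismatches are hypothetical names, and the check reflects how this one parser resolves rules, which is an assumption about any given crawler rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The setup from the article, without comments.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: GPTBot
Disallow: /
"""

# Intended policy: True means the bot should be allowed in.
EXPECTED = {
    "OAI-SearchBot": True,
    "ChatGPT-User": True,
    "Claude-SearchBot": True,
    "PerplexityBot": True,
    "GPTBot": False,
}


def policy_mismatches(robots_txt: str, expected: dict, path: str = "/") -> dict:
    """Return {bot: actual_access} for every bot whose access differs from the plan."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    mismatches = {}
    for bot, wanted in expected.items():
        actual = parser.can_fetch(bot, path)
        if actual != wanted:
            mismatches[bot] = actual
    return mismatches


print(policy_mismatches(ROBOTS_TXT, EXPECTED))  # {}

# A leftover wildcard block on its own shuts out every search bot.
print(policy_mismatches("User-agent: *\nDisallow: /\n", EXPECTED))
```

An empty result means the file matches the plan as this parser reads it. Re-running the check after a migration or plugin update is a cheap way to catch a snippet that has crept back in.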

Then verify the result the way your buyers would: ask ChatGPT and Perplexity about your category and see if you’re named. We run that exact count, dated and engine by engine, in the free AI-visibility audit.

Sources: Cloudflare announcement, July 1, 2025 (AI crawlers blocked by default for new customers); Semrush AI search traffic study, June 9, 2025; Cloudflare Radar robots.txt data via TechnologyChecker.io, June 2026; Reuters Institute study on news websites blocking AI crawlers.