How to fix a robots.txt that's blocking AI crawlers.

Did your agency block GPTBot two years ago and never tell you? It happens more than anyone admits: a blanket AI-bot block added during the 2023 scraping panic, still live, silently zeroing your visibility on every engine it covers. Nobody audits robots.txt — until the citations are gone.

Quick answer

Paste your store URL into the free Agentic Readiness Grader — crawler admittance is checked per bot as part of the agent-readability score, with the exact lines to change attached to every deny. Or read yourstore.com/robots.txt yourself against the bot list below.

The slow way: read the file, know the bots

This is one of the few jobs where the manual method is genuinely quick — if you know what to look for. Open yourstore.com/robots.txt in a browser and search for each AI user-agent. The list that matters for D2C as of mid-2026: GPTBot, OAI-SearchBot, ChatGPT-User (OpenAI), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot, Google-Extended, Applebot-Extended, and CCBot. A Disallow: / under any of these is a hard zero for that engine's crawl.
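The manual read can be approximated in a few lines of Python. This is a minimal sketch using the standard library's urllib.robotparser. The bot list is copied from the paragraph above; the function name audit_robots and the sample file are illustrative assumptions. Python's parser is not guaranteed to interpret a file the way each crawler does, so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens from the list above.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "ClaudeBot", "anthropic-ai", "PerplexityBot",
    "Google-Extended", "Applebot-Extended", "CCBot",
]


def audit_robots(robots_text: str, path: str = "/") -> dict[str, bool]:
    """Return {bot: True if the file text allows the bot to fetch `path`}."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in AI_BOTS}


# Hypothetical file: a blanket GPTBot block left over from 2023.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart
"""

if __name__ == "__main__":
    for bot, allowed in audit_robots(SAMPLE).items():
        print(f"{bot:20} {'allow' if allowed else 'DENY'}")
```

On the sample file, GPTBot is reported as denied and the other eight bots as allowed. To audit a live store, pass the text of yourstore.com/robots.txt instead of SAMPLE. One caveat: CPython's parser matches user-agent names by substring, so a group written for Applebot would also be applied to Applebot-Extended here.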

The catches that make the manual read unreliable: group precedence and wildcard interactions (a crawler that has its own User-agent group follows only that group and ignores User-agent: *, so a wildcard rule does not override a bot blocked by name), CDN-level bot management that blocks with a 403 even when the file says allow, and Shopify apps that regenerate the template. You also need to know the current bot names: OpenAI alone operates three with different purposes, documented in OpenAI's bot reference. The file is easy to read and easy to misread.
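The edge-level catch can be probed by requesting a page while presenting a bot's user-agent token and comparing the HTTP status with what the file says. A minimal sketch using only the standard library follows. The function names, the verdict labels, and the simplified user-agent string are assumptions. Real crawlers send longer strings from published IP ranges, so a CDN that verifies IPs may treat this probe differently from the real bot.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, bot_token: str, timeout: float = 10.0) -> int:
    """Fetch `url` with a simplified bot user-agent and return the HTTP status."""
    request = urllib.request.Request(
        url,
        headers={"User-Agent": f"Mozilla/5.0 (compatible; {bot_token}/1.0)"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is what we want.
        return err.code


def classify(file_allows: bool, status: int) -> str:
    """Combine the robots.txt verdict with the observed status code."""
    blocked_at_edge = status in (401, 403, 429)
    if file_allows and blocked_at_edge:
        return "edge block contradicts file"
    if file_allows:
        return "admitted" if 200 <= status < 300 else f"check status {status}"
    return "denied by file"
```

For example, classify(True, fetch_status("https://yourstore.com/", "GPTBot")) flags the case where the file says allow but the edge answers 403. Network failures such as DNS errors are left uncaught in this sketch.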


The eCommerce Insights way

  1. Run the grader. The Agentic Readiness Grader checks admittance per bot — actual fetch behavior, not just file text — and renders an allow/deny ledger. Edge-level blocks that contradict the file get caught here.
  2. Read the deny list. Each denied bot is mapped to what it costs you: GPTBot denied means OpenAI's training crawl skips your PDPs (ChatGPT search results depend on OAI-SearchBot); Google-Extended denied means Gemini loses training and grounding access (AI Overviews follow Googlebot rules, not Google-Extended); PerplexityBot denied removes you from the 3–7 citation slots per shopping answer.
  3. Apply the fix lines. The grader emits the exact directives. On Shopify, add them via robots.txt.liquid (Online Store → Themes → Edit code). Typical policy: allow the research and shopping crawlers above; many stores keep blocking Bytespider, which brings load without D2C upside.
  4. Verify the live file and keep watch. Re-fetch, rerun the grader, and confirm a 200 with the new rules. On a paid plan, crawler admittance is re-checked on every refresh, so the next silent agency edit or app regression triggers an alert instead of a quarter of invisible decline.
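The fix lines in step 3 are ordinary robots.txt groups. As an illustration, and not the grader's actual output, the following sketch builds one group per bot from an allow list and a deny list. The function name build_rules and the example lists are assumptions based on the policy described in step 3.

```python
def build_rules(allow: list[str], deny: list[str]) -> str:
    """Build one robots.txt group per bot: 'Allow: /' admits, 'Disallow: /' blocks."""
    groups = []
    for bot in allow:
        groups.append(f"User-agent: {bot}\nAllow: /")
    for bot in deny:
        groups.append(f"User-agent: {bot}\nDisallow: /")
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(build_rules(allow=["GPTBot", "PerplexityBot"], deny=["Bytespider"]))
```

Naming each bot in its own group keeps the policy explicit, because a named bot follows only its own group and does not inherit rules from User-agent: *.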

Robots.txt is one of four signal groups in the agent-readability score — the others are covered in check if AI agents can read my PDPs. For the full bot-by-bot policy discussion, read the robots.txt for AI crawlers post.

What "good" looks like

Shopping-relevant AI bots admitted (OpenAI ×3, Anthropic ×2, Perplexity, Google-Extended, Applebot, CCBot): allow
High-load, low-value crawlers (e.g. Bytespider): deny
Edge/CDN rules consistent with the file: verified
Re-check cadence after agency or app changes: weekly
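
The weekly re-check amounts to comparing this week's per-bot verdicts with last week's and flagging any bot that flipped from allow to deny. A minimal sketch, assuming verdicts are stored as a plain mapping from bot name to True (allow) or False (deny); the function name and sample data are hypothetical.

```python
def find_regressions(previous: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return bots admitted in `previous` but denied (or missing) in `current`, sorted."""
    return sorted(
        bot
        for bot, was_allowed in previous.items()
        if was_allowed and not current.get(bot, False)
    )


if __name__ == "__main__":
    last_week = {"GPTBot": True, "ClaudeBot": True, "Bytespider": False}
    this_week = {"GPTBot": False, "ClaudeBot": True, "Bytespider": False}
    print(find_regressions(last_week, this_week))
```

A bot missing from the current snapshot is treated as denied, so a check that failed to run shows up as a regression and does not pass silently.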

After unblocking, crawlers typically return within days; citation recovery follows over two to six weeks as retrieval indexes refresh (eCommerce Insights observations, early 2026). Watch the recovery per SKU rather than assuming the edit landed.

Frequently asked questions

Which AI crawlers should an ecommerce store allow?
For D2C visibility as of mid-2026: GPTBot and OAI-SearchBot (OpenAI training and search), ChatGPT-User (live browsing), ClaudeBot and anthropic-ai (Anthropic), PerplexityBot, Google-Extended (Gemini training and grounding; AI Overviews follow Googlebot rules), Applebot and Applebot-Extended, and CCBot (Common Crawl, which feeds many models). Many stores choose to block Bytespider (ByteDance), which generates heavy load with little D2C upside.

How did our robots.txt end up blocking AI crawlers?
Usually one of three ways: an agency added a blanket AI-bot block in 2023–24 when scraping fears peaked and never revisited it; a CDN or bot-management rule (Cloudflare's AI-bot toggle, for example) blocks at the edge so the file looks fine but requests still 403; or a platform default shipped restrictive and nobody checked. The fix is one edit; finding out it is needed is the hard part.

Does blocking GPTBot stop my products appearing in ChatGPT entirely?
Not by itself. GPTBot is OpenAI's training crawler, so blocking it keeps your PDPs out of OpenAI's training crawl, while ChatGPT search relies on OAI-SearchBot. If OAI-SearchBot is blocked too, as a blanket AI-bot block would do, answers about your products fall back to third-party sources: review sites, marketplaces, competitors' comparisons. The brand may still be mentioned from training data, but your pages cannot be cited and your facts cannot be corrected. For a commerce site, blocking the shopping-relevant bots is self-sabotage.

How do I edit robots.txt on Shopify?
Shopify generates robots.txt from a theme template. In the admin: Online Store → Themes → Edit code → Add a new template → robots.txt.liquid, then add the per-bot rules. Shopify's default is permissive toward most AI bots as of mid-2026, but apps and custom templates override it, so always verify the live file.

How fast does visibility recover after unblocking?
Crawlers typically re-fetch within days of the change; citation recovery follows over two to six weeks as retrieval indexes refresh, based on eCommerce Insights observations as of early 2026. Track the recovery per SKU with weekly monitoring rather than assuming the edit worked.

Find out which bots your store turns away.

Per-bot admittance, checked free in 30 seconds.