How to verify your llms.txt.
Publishing an llms.txt is easy. Publishing one that is correct, complete, and actually being fetched is the part that takes a checklist.
Open yourdomain.com/llms.txt in a browser. Confirm it serves with a 200 status, follows the markdown spec (H1, single-sentence description, then grouped link sections), and includes every high-revenue page in your sitemap. Check server logs for fetches from OAI-SearchBot, PerplexityBot, ClaudeBot, and Google-Extended. The free llms.txt generator validates compliance and outputs a fixed version.
What llms.txt is
llms.txt is a markdown file served at the root of a domain that gives AI engines a curated entry point to the site. It was proposed by Jeremy Howard in late 2024 and has been adopted in some form by ChatGPT, Perplexity, and Claude as of Q1 2026. The format is intentionally simple: an H1 with the site name, a one-sentence description, then markdown sections grouping high-value URLs with short context blurbs. The full proposal lives at llmstxt.org. Treat the file as a low-cost positive signal; it is not a guarantee of citation and does not replace robots.txt or sitemap.xml.
Step 1: Confirm the file is served
Open yourdomain.com/llms.txt in a browser. Three checks:
- HTTP status is 200, not a 301 redirect or a 404.
- Content-Type header is text/plain or text/markdown, not HTML.
- The file renders as plain text in the browser, not styled HTML.
On Shopify, serving a true text/plain file at the root requires an app, a redirect, or a custom-page template hack. eCommerce Insights' free llms.txt generator outputs a Shopify-compatible deployment with the right content type. See "Generate llms.txt for my Shopify store" for the generation step.
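The three checks above can be scripted. The following is a minimal sketch using only the Python standard library; the function names, the user-agent string, and the HTML heuristic are illustrative assumptions, not part of any llms.txt tooling.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Content types the checklist above accepts.
ACCEPTED_TYPES = {"text/plain", "text/markdown"}


def evaluate_response(status, content_type, body):
    """Return a list of problems for one llms.txt response (empty list = pass)."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    # Rough heuristic: a body that opens with HTML markup suggests the
    # storefront layout wrapped the file.
    head = body.lstrip()[:200].lower()
    if head.startswith(("<!doctype", "<html")):
        problems.append("body looks like HTML, not plain text")
    return problems


def fetch_and_check(url):
    """Fetch the URL and evaluate it (hypothetical helper, makes a network call).

    urlopen follows redirects silently, so a redirect is detected by
    comparing the final URL with the requested one.
    """
    request = Request(url, headers={"User-Agent": "llms-txt-check/0.1"})
    try:
        with urlopen(request, timeout=10) as response:
            problems = evaluate_response(
                response.status,
                response.headers.get("Content-Type"),
                response.read().decode("utf-8", errors="replace"),
            )
            if response.geturl() != url:
                problems.append(f"redirected to {response.geturl()}")
            return problems
    except HTTPError as error:
        return [f"expected status 200, got {error.code}"]
```

Calling fetch_and_check("https://yourdomain.com/llms.txt") returns an empty list when all three checks pass. Connection failures raise urllib.error.URLError, which this sketch leaves unhandled.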
Step 2: Validate spec compliance
A spec-compliant llms.txt has:
- An H1 at the top with the site or brand name.
- A short blockquote or paragraph immediately after, describing the site in one or two sentences.
- One or more H2 sections grouping links. Conventional sections include "Products," "Collections," "Policies," "FAQ," and "About."
- Link lists in markdown bullet syntax, each item written as: - [Link text](https://...): brief context
- An optional "Optional" H2 at the end for less critical pages an engine may skip.
Common gotchas: HTML inside the file (engines strip it inconsistently); links to staging URLs (they will resolve to 404 in production); duplicate sections; links wrapped in additional markdown decoration. Run the file through the generator's compliance check.
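For a quick local pass before running the generator's check, the structure and gotchas above can be approximated in a few lines of Python. This is a sketch under stated assumptions: the function name, the staging markers, and the HTML pattern are all hypothetical, and the HTML pattern is deliberately rough (it will also flag angle-bracket autolinks).

```python
import re

# One list item: "- [text](url)" with an optional ": context" suffix.
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\((https?://[^)\s]+)\)(: .+)?$")
# Rough heuristic for HTML tags.
HTML_TAG = re.compile(r"<[a-zA-Z/][^>]*>")


def validate_llms_txt(text, staging_markers=("staging.", "localhost")):
    """Return a list of structural problems (empty list = no problems found)."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_blank = [line for line in lines if line.strip()]

    # H1 first, then a description before any heading or list.
    if not non_blank or not non_blank[0].startswith("# "):
        problems.append("first non-blank line is not an H1")
    elif len(non_blank) < 2 or non_blank[1].startswith(("#", "- ")):
        problems.append("no description directly after the H1")

    # At least one H2 section, with no repeated section names.
    sections = [line[3:].strip() for line in lines if line.startswith("## ")]
    if not sections:
        problems.append("no H2 sections")
    for name in sorted({s for s in sections if sections.count(s) > 1}):
        problems.append(f"duplicate section: {name}")

    for number, line in enumerate(lines, start=1):
        if HTML_TAG.search(line):
            problems.append(f"line {number}: HTML tag found")
        if line.startswith("- "):
            match = LINK_ITEM.match(line)
            if not match:
                # Catches links wrapped in extra decoration such as bold.
                problems.append(f"line {number}: list item is not '- [text](url): context'")
            elif any(marker in match.group(1) for marker in staging_markers):
                problems.append(f"line {number}: staging URL")
    return problems
```

The staging markers are placeholders; replace them with whatever hostnames your own staging environment uses.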
Step 3: Audit sitemap parity
Compare URLs in llms.txt against sitemap.xml. The two should not be identical (llms.txt is curated, sitemap.xml is exhaustive), but every high-revenue product, every active collection, and every key policy page in the sitemap should appear in llms.txt. Conversely, llms.txt should not contain URLs that are missing from the sitemap. Drift is common: a brand publishes llms.txt once and forgets it as the catalog changes. Review parity at least quarterly.
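The comparison in both directions can be sketched as a small set difference. The function names and the must_include list (your own hand-picked priority URLs) are assumptions for illustration. The sketch reads a single urlset file; if your sitemap.xml is a sitemap index that points to child sitemaps, run it against each child file instead.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Collect every <loc> value from one sitemap file."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


def llms_txt_urls(markdown_text):
    """Collect every URL that appears as a markdown link target."""
    return set(re.findall(r"\]\((https?://[^)\s]+)\)", markdown_text))


def parity_report(llms_text, sitemap_xml, must_include=()):
    """Report priority URLs missing from llms.txt and llms.txt URLs absent from the sitemap."""
    listed = llms_txt_urls(llms_text)
    in_sitemap = sitemap_urls(sitemap_xml)
    return {
        "missing_priority": sorted(set(must_include) - listed),
        "not_in_sitemap": sorted(listed - in_sitemap),
    }
```

URLs are compared as exact strings, so trailing slashes and http versus https variants count as different entries. Normalize them first if your two files are inconsistent on that point.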
Step 4: Check crawler fetches
Open server logs or a Cloudflare access log. Filter for requests to /llms.txt in the last 30 days. The user agents to look for:
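- OAI-SearchBot: ChatGPT search crawler.
- PerplexityBot: Perplexity's crawler.
- ClaudeBot: Anthropic's crawler.
- Google-Extended: Google's control token for AI training use. It is a robots.txt token with no user-agent string of its own, so Google's fetches appear under its regular crawler user agents such as Googlebot.
- GPTBot: OpenAI's training crawler.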
If none of these have fetched llms.txt in 30 days, the file is unlikely to be influencing AI citation. Common reasons: the file is not actually being served, robots.txt blocks the crawler, or the domain is too small for the crawlers' fetch cadence. The first two are fixable; the third is solved by being patient and growing.
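The 30-day filter can be sketched as a short log scan. This assumes an access log in the common "combined" format (timestamp in square brackets, user agent as the last quoted field); Cloudflare exports and other formats need a different pattern. The function name is hypothetical. Google-Extended is left out of the token list because it is a robots.txt token and does not appear as a user-agent string in logs.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# User-agent substrings to count (Google-Extended never appears in logs).
CRAWLER_TOKENS = ("OAI-SearchBot", "PerplexityBot", "ClaudeBot", "GPTBot")

# Assumed "combined" log format: [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_llms_txt_fetches(log_lines, now, days=30):
    """Count /llms.txt requests per crawler token within the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        # Ignore unparseable lines and requests for any other path.
        if not match or match.group("path").split("?")[0] != "/llms.txt":
            continue
        # %b (month abbreviation) assumes an English/C locale.
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if when < cutoff:
            continue
        agent = match.group("agent").lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in agent:
                counts[token] += 1
    return counts
```

Pass it the lines of a log file and datetime.now(timezone.utc). An empty Counter corresponds to the "no fetches in 30 days" case described above. User-agent strings can be spoofed, so treat the counts as an indication rather than proof.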
Step 5: Schedule a refresh cadence
llms.txt should refresh whenever the catalog changes meaningfully: new product launches, discontinued SKUs, repositioned collections. Weekly for active D2C catalogs, monthly for stable ones. Manual maintenance is brittle; the eCommerce Insights paid product re-generates and pushes llms.txt weekly.
Crawler adoption status
llms.txt adoption is still uneven across engines as of Q1 2026. A practical read of where each major engine stands:
- ChatGPT (OpenAI). Confirmed fetching llms.txt via OAI-SearchBot. Influence on citation patterns is positive but not yet quantified.
- Perplexity. Documented support; PerplexityBot fetches llms.txt and uses it as a source-discovery hint.
- Claude (Anthropic). ClaudeBot fetches it; Anthropic has acknowledged the convention in public materials.
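- Google (Gemini, AI Overviews). No formal commitment as of Q1 2026. Google-Extended is a robots.txt token rather than a separate crawler, so any fetch of the file shows up under Google's regular user agents, and Google has not stated whether the file influences AI Overviews ranking.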
- Copilot (Microsoft). Bingbot fetches it; Microsoft has not formally documented usage.
Treat llms.txt as a low-cost, positive-expected-value signal. Ship the file, monitor fetches, and pair it with the structural improvements (schema, PDP rewrites, review grounding) that drive the bulk of citation lift.
Common mistakes
- HTML wrapping. Some apps render llms.txt inside the storefront HTML layout. AI engines may strip or skip the wrapped version.
- Stale URLs. Discontinued products in llms.txt produce broken citations. Remove on a refresh cycle.
- Over-listing. llms.txt is not a sitemap. Listing 5,000 SKU URLs dilutes the signal.
- No description blurbs. The short context after each link helps the engine understand what it is fetching.
- Missing policies. Shipping and return policy pages are frequently cited; omit them and you miss citations.
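Three of the mistakes above (over-listing, missing blurbs, missing policy pages) lend themselves to a simple lint. The sketch below is illustrative only: the function name is hypothetical, the 200-link threshold is an arbitrary assumption rather than a figure from the llms.txt proposal, and policy pages are detected by a crude substring match on the URL.

```python
import re

# One list item: "- [text](url)" followed by whatever trails the link.
LINK = re.compile(r"^- \[(?P<text>[^\]]+)\]\((?P<url>https?://[^)\s]+)\)(?P<blurb>.*)$")


def lint_common_mistakes(text, max_links=200, policy_hints=("shipping", "return")):
    """Return warnings for over-listing, missing blurbs, and missing policy links."""
    matches = (LINK.match(line.strip()) for line in text.splitlines())
    links = [m for m in matches if m]
    warnings = []

    # Over-listing: max_links is an arbitrary, adjustable threshold.
    if len(links) > max_links:
        warnings.append(f"over-listing: {len(links)} links (threshold {max_links})")

    # Missing blurbs: nothing meaningful after the closing parenthesis.
    for m in links:
        if not m.group("blurb").lstrip(": ").strip():
            warnings.append(f"no description blurb: {m.group('url')}")

    # Missing policies: crude substring match against all listed URLs.
    urls = " ".join(m.group("url").lower() for m in links)
    for hint in policy_hints:
        if hint not in urls:
            warnings.append(f"no link that looks like a {hint} policy page")
    return warnings
```

Adjust policy_hints to match the URL slugs your store actually uses for its policy pages.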
An llms.txt that no crawler has fetched in 30 days is a file you wrote for yourself. Useful as a habit. Not yet useful as a signal.
Related tools
- llms.txt generator — free, Shopify-compatible.
- AEO Grader — confirms crawl access alongside schema and content.