Glossary entry

llms.txt

A proposed Markdown file served at a site's root (/llms.txt) that gives LLM crawlers a curated, ranked reading list instead of forcing them to reconstruct the site cold.

Last updated June 2026

The format and its status

The convention is documented at llmstxt.org: an H1 heading naming the site, a one-paragraph blockquote summary, then H2 sections of curated links, each with a short description written for a machine deciding whether to fetch. As of mid-2026 it is proposer-led, not standards-body-led: adoption has spread across SaaS and content sites and several AI vendors have acknowledged the file in public notes, but no engine guarantees consumption. Shipping one costs a few hundred lines of static text, so the upside-versus-cost math favors having it even while the convention matures.
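The layout described above is regular enough to check mechanically. The following is a minimal sketch, not an official validator: it assumes sections are H2 headings and link items take the form "- [title](url): description", and the function name check_llms_txt is hypothetical.

```python
import re

# Assumed link-item shape: "- [title](url)" with an optional ": description".
LINK_ITEM = re.compile(r"^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def check_llms_txt(text: str) -> list[str]:
    """Return structural problems found; an empty list means the layout looks right."""
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    problems = []
    # The first non-empty line should be the H1 naming the site.
    if not lines or not lines[0].startswith("# "):
        problems.append("first line should be an H1 naming the site")
    # A blockquote summary should follow the H1.
    if not any(ln.startswith("> ") for ln in lines[1:]):
        problems.append("missing blockquote summary")
    # Curated links are grouped under H2 headings.
    if not any(ln.startswith("## ") for ln in lines):
        problems.append("no H2 sections of links")
    if not any(LINK_ITEM.match(ln) for ln in lines):
        problems.append("no '- [title](url): description' link items")
    return problems
```

A check like this only confirms shape. Whether the links are the right ones, in the right order, remains an editorial decision.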

Why it matters for ecommerce

Ecommerce platforms emit sprawling structures by default: product pages, collection pages, tag archives, policy pages. A crawler reading the site cold spends its budget on pages the brand does not especially want surfaced. A curated llms.txt nudges the crawler toward the product detail pages (PDPs) and guides the brand wants cited, in the brand's own ranking.

It is also the fastest signal for changes: when a brand rebuilds PDPs or reorganizes collections, updating llms.txt is a one-line change, while waiting for the engine to re-derive the site's hierarchy can take weeks. For Shopify stores — where theme structure is not fully controllable — the file is one of the few crawler-guidance levers entirely in the merchant's hands.

A curated file in practice: an example

A climbing-rope brand's llms.txt opens with a two-sentence brand description, then sections for "Flagship ropes," "By discipline," "Gear care," and "About," each with five to eight curated links. When an LLM crawler retrieves the file, it has a ranked reading list: the 9.2mm flagship PDP, the alpine collection, the rope-care guide. For engines that consume the file, that ordering can shape which pages enter the retrieval set and, weeks later, which URLs get cited when a shopper asks for rope recommendations. The file is regenerated whenever flagship URLs change, as part of the release checklist.
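Regenerating the file as part of a release checklist is easier when it is rendered from data rather than edited by hand. The sketch below shows one way to do that. The brand name, summary, URLs, and the Link and render_llms_txt names are all hypothetical, and the sections are trimmed to one link each for brevity.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str  # short, factual description written for the fetching machine


def render_llms_txt(site: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an H1, a blockquote summary, then one H2 section per dict entry."""
    out = [f"# {site}", "", f"> {summary}", ""]
    # Dicts keep insertion order, so section order here is the ranking.
    for heading, links in sections.items():
        out.append(f"## {heading}")
        out.append("")
        out.extend(f"- [{link.title}]({link.url}): {link.note}" for link in links)
        out.append("")
    return "\n".join(out).rstrip() + "\n"


# Hypothetical brand and URLs; a real file would carry five to eight links per section.
sections = {
    "Flagship ropes": [
        Link("9.2mm single rope", "https://example.com/products/rope-9-2mm",
             "Flagship dynamic single rope."),
    ],
    "By discipline": [
        Link("Alpine collection", "https://example.com/collections/alpine",
             "Ropes selected for alpine routes."),
    ],
    "Gear care": [
        Link("Rope-care guide", "https://example.com/guides/rope-care",
             "Washing, storage, and retirement criteria."),
    ],
}
example = render_llms_txt("Example Rope Co.", "Hypothetical brand summary goes here.", sections)
print(example)
```

Keeping the sections in an ordered structure means a flagship URL change is a one-line data edit, and the ranking stays explicit in source control.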

How it relates to neighboring terms

llms.txt serves AI discoverability — helping engines find the right pages — and complements Product schema, which makes those pages parseable once fetched. It works alongside robots.txt, not instead of it: robots.txt grants or denies access per bot, llms.txt suggests priorities to bots already admitted. Crawler admittance itself is checked in the agent-readability score.

How eCommerce Insights uses it

Presence and freshness of llms.txt are checked in every store scan, and the free llms.txt Generator builds a curated file from a store URL — flagship products first, collections, then policies. The platform flags the file when catalog changes make it stale.

Related terms

AI discoverability — the outcome a curated reading list supports.
Product schema — the structured counterpart once a page is fetched.
Agent-readability score — where crawler-guidance checks live.
GEO (Generative Engine Optimization) — the discipline this file is one tactic within.
AI visibility — the downstream metric llms.txt contributes to.

Frequently asked questions

Do AI engines actually read llms.txt?

Unevenly, and without guarantees, as of mid-2026. The convention is proposer-led; several AI vendors have acknowledged it and adoption among sites has grown, but no engine publishes a binding commitment to consume it. The honest case for shipping one is asymmetric cost: a few hundred lines of static text against a meaningful chance of steering retrieval.

What should an ecommerce llms.txt contain?

A brand line and one-paragraph summary, then curated sections: flagship PDPs first, key collections, buying guides, and policies (shipping, returns) that agents factor into recommendations. Keep descriptions front-loaded and factual — they are written for a machine deciding whether to fetch — and keep the file current when flagship URLs change.

Is llms.txt a replacement for robots.txt or sitemap.xml?

No — the three do different jobs. robots.txt grants or denies crawler access; sitemap.xml enumerates URLs for indexing; llms.txt curates and ranks what matters for a context-limited LLM crawler. Ship all three, and make sure they agree: an llms.txt pointing at pages robots.txt blocks helps nobody.
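The agreement between llms.txt and robots.txt can be checked with Python's standard-library robots.txt parser. This is a rough sketch under stated assumptions: both files are already loaded as strings, links use absolute http(s) URLs in markdown form, and the blocked_links name and example.com URLs are hypothetical.

```python
import re
from urllib.robotparser import RobotFileParser

# Pull absolute URLs out of markdown links of the form [title](https://...).
MD_LINK = re.compile(r"\]\((https?://[^)\s]+)\)")


def blocked_links(llms_txt: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return llms.txt URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network fetch
    urls = MD_LINK.findall(llms_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


robots = "User-agent: *\nDisallow: /collections/sale\n"
llms = (
    "- [Rope](https://example.com/products/rope): flagship rope\n"
    "- [Sale](https://example.com/collections/sale): discounted items\n"
)
conflicts = blocked_links(llms, robots)
print(conflicts)
```

Running a check like this per crawler user agent matters because robots.txt rules can differ by bot: a page open to one crawler may be closed to another.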

Where does llms.txt live on a Shopify store?

At the domain root — yourstore.com/llms.txt. Shopify does not generate one natively as of mid-2026, so merchants serve it via a theme asset with a redirect, an app, or an edge rule depending on setup. The llms.txt for Shopify guide walks through the options.

Go deeper

llms.txt for Shopify (the guide): format, hosting, and Shopify-specific setup.
llms.txt Generator — generate a curated file from your store URL, free.
Schema for AI search — the companion structured-data work.
AI Agent Lens docs — how crawler signals are evaluated in the product.
