Guide · Research · Updated June 2026

AI keyword research for D2C: citability over volume

Buyers type to AI engines differently than they type to Google: full sentences, stacked qualifiers, budgets stated naturally, follow-up turns that shift intent mid-conversation. This guide is a practical research method for D2C brands planning content and PDP work around those differences — built on prompt libraries and citation research rather than volume scores.

eCommerce Insights team · 9 min read

What is different about AI keyword research

Classical keyword research built lists of two-to-four-word queries with clean volume and difficulty scores, against a surface of ten blue links. AI research works against a different surface: conversational input, generative answers, selective citations. Queries are longer. Qualifiers stack. Volume data is opaque. The research job changes from "which queries have volume and rank potential" to "which queries are being answered in AI, which sources get cited, and where does the catalog have a claim." The overlap with classical SEO is large but not total: intent research and topic clustering still pay, and the new layer is citation research. The broader framework is covered in the guide "What is GEO".

How buyers actually phrase shopping queries to AI

Observable patterns from public AI chat interfaces as of mid-2026: buyers write full sentences with context ("I'm doing a long weekend of backcountry skiing in late February, I run hot, what's the best merino base layer under $120?"). They state budgets naturally. They mention use cases, body types, seasons, constraints. They ask follow-ups that reference earlier turns, and they often ask the engine to compare two or three options rather than name one winner. This is a conversation with a knowledgeable salesperson, not a keyword search. PDPs written with that shape in mind produce passages engines can lift directly into answers, as covered in the guide on optimizing content for AI search.

Stacked qualifiers

"Best merino base layer for backcountry skiing under $120" packs product type, material, use case, and price into one string. On Google, the same buyer would type "merino base layer" and filter manually; to an AI engine, they hand over the whole stack at once because the engine handles it. For a D2C catalog this is an opportunity: the long-tail shape means less competition, and the product matching the full qualifier stack tends to win the citation. A PDP that states price tier, material, and use case in plain-language passages, and in additionalProperty structured data as described in the schema guide, has a structural advantage on exactly these queries.

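As a rough illustration of the additionalProperty idea in the stacked-qualifiers section, the sketch below builds a schema.org Product object in Python and prints it as JSON-LD. Product, Offer, PropertyValue, and additionalProperty are schema.org terms; the helper name product_jsonld, the product name, the property names, and all values are assumptions made up for this example, not recommendations from the schema guide.

```python
import json

def product_jsonld(name, price, currency, properties):
    """Build a schema.org Product dict whose qualifiers live in additionalProperty."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
        # One PropertyValue per qualifier a buyer might stack into a query.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": key, "value": value}
            for key, value in properties.items()
        ],
    }

# Hypothetical product: every name and value below is illustrative.
doc = product_jsonld(
    "Example Merino Base Layer",
    109.00,
    "USD",
    {
        "Price tier": "under $120",
        "Material": "merino wool",
        "Use case": "backcountry skiing",
    },
)
print(json.dumps(doc, indent=2))
```

The printed object would normally sit in a script tag of type application/ld+json on the PDP, alongside plain-language passages that state the same three qualifiers.
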
Intent drift in AI conversations

A single conversation often moves through three intents: discovery ("what's the best merino base layer under $120?"), specification ("what weight should I get for late winter?"), logistics ("which of those ships in time for next weekend?"). Each turn can surface a different cited-source set. A flat keyword list undersells this; mapping conversation arcs — discovery, comparison, specification, logistics, purchase, post-purchase — lets a content team plan pages that can be cited at more than one turn.
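One way to make the arc mapping concrete is a small coverage check: tag each page with the arc stages it can plausibly answer, then list the stages no page covers. This is a minimal sketch; the stage names come from the paragraph above, while the function name arc_gaps, the page paths, and the tags are hypothetical.

```python
# Arc stages in the order named in the text.
ARC = ["discovery", "comparison", "specification",
       "logistics", "purchase", "post-purchase"]

def arc_gaps(pages):
    """Return the arc stages that no page is tagged to answer, in arc order."""
    covered = {stage for stages in pages.values() for stage in stages}
    return [stage for stage in ARC if stage not in covered]

# Hypothetical page inventory: path -> stages the page can be cited for.
pages = {
    "/guides/merino-base-layers": ["discovery", "comparison"],
    "/products/example-base-layer": ["specification", "purchase"],
}

gaps = arc_gaps(pages)
print(gaps)  # ['logistics', 'post-purchase']
```

Here the gap list suggests that shipping-time and care or returns content is missing, which are the turns where a flat keyword list would not have flagged anything.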

Keyword lists optimize for the first thing a buyer types. Conversation arcs optimize for the third thing they ask — which is often where a citation decides the purchase.

Brand-less queries: the neutral-intent majority

A large share of AI shopping queries contain no brand name. The engine decides which brands surface — and observed behavior across ChatGPT, Perplexity, and Google AI Mode as of mid-2026 puts three to seven cited sources on a typical neutral-intent answer, blending retailer PDPs, independent reviews, and editorial buying guides. Winning brand-name queries is defensive; winning neutral-intent queries grows share of answer against competitors the buyer has not named yet. A good research plan prioritizes neutral-intent queries in every category the brand wants to win — they are the queries that feed the share-of-answer outcome in a GEO strategy.
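Separating brand-less queries from branded ones can be sketched with a simple name match. The approach below is an assumption for illustration: it treats a query as neutral-intent when it contains none of a supplied list of brand names, and the brand names and queries are invented. A production list would need aliases and misspellings.

```python
import re

def is_brandless(query, brands):
    """True when the query mentions none of the brand names (whole word, case-insensitive)."""
    return not any(
        re.search(rf"\b{re.escape(brand)}\b", query, flags=re.IGNORECASE)
        for brand in brands
    )

brands = ["Acme Wool", "Northpeak"]  # hypothetical brand names
queries = [
    "what's the best merino base layer under $120?",
    "is the Northpeak base layer warm enough for February?",
    "acme wool vs northpeak for backcountry skiing",
]

# Keep only the queries where the engine, not the buyer, picks the brands.
neutral = [q for q in queries if is_brandless(q, brands)]
print(neutral)
```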

The research workflow

1. Mine real language. CX transcripts, support tickets, on-site search logs, and review text are records of how buyers actually phrase needs.
2. Ask the engines. "What are the ten most common questions a buyer asks before purchasing a merino base layer?" produces usable starter lists from ChatGPT and Perplexity, to be treated skeptically and cross-checked.
3. Cross-check with classical tools. Volume and intent data from the SEO stack confirms which clusters carry demand.
4. Build per-category prompt libraries. Twenty to fifty conversational queries per category, tagged by intent: discovery, comparison, specification, logistics.
5. Run citation research. For each priority query: which engines answer it, which sources are cited, and whether your PDPs or guides are among them. This is the layer classical tools do not cover.
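The last two steps can be sketched as a tagged prompt library plus a gap check over recorded citations. This is a minimal sketch under stated assumptions: the cited URLs are entered by hand after running each prompt in an engine (no engine API is called), and the names Prompt, cited_domains, citation_gaps, and every domain below are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

INTENTS = {"discovery", "comparison", "specification", "logistics"}

@dataclass(frozen=True)
class Prompt:
    text: str
    intent: str
    category: str

    def __post_init__(self):
        # Reject tags outside the four intents used in the library.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent}")

def cited_domains(urls):
    """Reduce cited URLs to lowercase hostnames with any leading 'www.' dropped."""
    hosts = (urlparse(url).netloc.lower() for url in urls)
    return {host[4:] if host.startswith("www.") else host for host in hosts}

def citation_gaps(observations, own_domain):
    """Return (prompt text, engine) pairs where own_domain was not cited."""
    return [
        (prompt.text, engine)
        for (prompt, engine), urls in observations.items()
        if own_domain not in cited_domains(urls)
    ]

p1 = Prompt("what's the best merino base layer under $120?", "discovery", "base layers")
p2 = Prompt("what weight should I get for late winter?", "specification", "base layers")

# Hand-recorded citations per (prompt, engine); all URLs are made up.
observations = {
    (p1, "ChatGPT"): ["https://www.example-reviews.com/base-layers",
                      "https://shop.example.com/products/merino"],
    (p1, "Perplexity"): ["https://example-reviews.com/base-layers"],
    (p2, "ChatGPT"): ["https://shop.example.com/guides/fabric-weight"],
}

gaps = citation_gaps(observations, "shop.example.com")
print(gaps)
```

Each entry in the gap list is a query and engine pair where a competitor or third-party source is cited and the catalog is not, which is the raw material for the content plan.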

The output feeds two systems: the content plan, and the tracking configuration — the same prompt libraries become the weekly monitoring battery in SKU-level tracking, which records per-SKU citations against each query over time.

Tools, honestly assessed

As of mid-2026: classical SEO tools (Semrush, Ahrefs, SE Ranking, Google Search Console) remain useful for volume, intent, and SERP context, but none of them shows which sources an AI answer cited for a conversational query. AI visibility platforms cover that layer at different altitudes: brand-level tools report mention share against prompt sets; eCommerce Insights resolves citations to SKUs and links each gap to a PDP fix. For quick manual checks, the engines themselves plus a disciplined spreadsheet work at small scale. Past a few dozen SKUs, the number of query, engine, and SKU combinations to re-check each week outgrows manual review, which is the point of automation. The single-product version is free: the ChatGPT product visibility checker runs real intent queries against one product, and the measurement framework behind all of it is the product AI visibility pillar. For published evidence that query phrasing changes citation outcomes, see the GEO benchmark paper (KDD 2024).
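To see why manual spreadsheet checks stop scaling, a back-of-envelope estimate helps. The library sizes (twenty to fifty queries per category) and the three engines come from this guide; the two minutes per check and the category counts are assumptions chosen only for illustration, and resolving each answer to individual SKUs would add further time on top.

```python
def weekly_manual_minutes(categories, queries_per_category, engines, minutes_per_check):
    """Minutes per week to run every library query once in every engine by hand."""
    return categories * queries_per_category * engines * minutes_per_check

# One category with a small library: about two hours a week.
small = weekly_manual_minutes(1, 20, 3, 2)

# Five categories with full libraries: about twenty-five hours a week.
large = weekly_manual_minutes(5, 50, 3, 2)

print(small, large)  # 120 1500
```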

Questions researchers ask

How is AI keyword research different from classical keyword research?

Classical research built short keyword lists with volume and difficulty scores against a ten-blue-links surface. AI research works against conversational input, generative answers, and selective citations: queries are longer, qualifiers stack, intent drifts across a conversation, and volume data is opaque. The job becomes finding which queries are answered in AI, which sources get cited, and where the catalog has a claim.

What are stacked-qualifier queries?

Queries that pack product type, material, use case, budget, and constraints into one string — "best merino base layer for backcountry skiing under $120." Buyers type them naturally to AI engines because the engines handle them. They are an opportunity for D2C: less competition, and the product matching the full stack tends to win the citation.

Do classical SEO tools still help with AI keyword research?

Yes, as one of three inputs. Semrush, Ahrefs, SE Ranking, and Search Console remain useful for volume, intent, and topic clustering. The new layer they do not cover is citation research — for each priority query, which engines answer it, which sources they cite, and whether your pages are among them. That layer requires querying the engines directly or using a tracking tool.

How many queries should a D2C brand track per category?

Twenty to fifty conversational queries per category, tagged by intent — discovery, comparison, specification, logistics. That is enough to cover the realistic arc of a buying conversation without drowning the weekly review. Prioritize neutral-intent (brand-less) queries; they are where share of answer is won or lost against competitors the buyer has not named.

From library to ledger

Turn your prompt library into weekly tracking.

Load your queries, connect the catalog, and watch per-SKU citations move query by query.
