How to benchmark your catalog against competitors.
Your audit says 62. Is that good? Without the category norm per engine, a score is a number in a vacuum — it can't tell leadership whether to celebrate, invest, or panic. Scores without context don't drive decisions; benchmarks are the context.
eCommerce Insights shows the category benchmark (the median for stores in your vertical, per engine and per factor) beside every score, plus a competitor watchlist for direct head-to-head comparison. This job is part of SKU-level tracking; the aggregate share number lives in "Compare my brand's AI share of voice."
The slow way: benchmark by anecdote
The manual benchmark is built from fragments. You grade three competitor PDPs by hand through a free tool and infer the category from a sample of three. You read a vendor's "state of AI search" report whose category definitions don't match yours. You ask a peer at another brand what their audit scored, over drinks, and remember the number selectively. From these fragments, a narrative forms — "we're probably about average" — that no one can defend when the CFO pushes on it.
The deeper problem is that ad-hoc benchmarks blend engines and factors into one impression. Your real position is almost never uniform: above norm on schema, below on review signal, fine on ChatGPT, weak on Perplexity. The blended impression hides exactly the contrast that would tell you what to do next — which is the entire purpose of benchmarking.
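To make that concrete, here is a minimal Python sketch with made-up numbers. The scores, the medians, and the third engine name are illustrative assumptions, not eCommerce Insights data. It shows a blended average sitting exactly on the norm while the per-engine gaps point in opposite directions.

```python
from statistics import mean

# Hypothetical audit scores and category medians per engine (illustrative only).
our_scores = {"ChatGPT": 71, "Perplexity": 49, "Gemini": 66}
category_median = {"ChatGPT": 62, "Perplexity": 60, "Gemini": 64}


def gaps_by_engine(scores, medians):
    """Return our score minus the category median for each shared engine."""
    return {
        engine: scores[engine] - medians[engine]
        for engine in scores
        if engine in medians
    }


gaps = gaps_by_engine(our_scores, category_median)

# The "impression" view: one number for us, one for the category.
blended_gap = mean(our_scores.values()) - mean(category_median.values())

print(f"blended gap: {blended_gap:+.1f}")
for engine, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{engine:<12}{gap:+d}")
```

With these numbers the blended gap is zero, which reads as "about average", yet the per-engine view shows a double-digit deficit on Perplexity next to a comfortable lead on ChatGPT.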
The eCommerce Insights way
- Score your own catalog first. Run the full scan — every SKU, both scores. A benchmark against an incomplete baseline flatters or frightens at random. Start with the catalog audit.
- Read the benchmark per engine. Beside every score sits the category median from eCommerce Insights audit data, per engine and per factor — labeled illustrative where the vertical's sample is small. Above norm on ChatGPT and below on Perplexity is a finding, not a wash.
- Build the watchlist for head-to-head. Medians answer "are we behind the market"; the competitor watchlist answers "are we behind the rival who takes our slots." Both questions matter; they have different answers surprisingly often.
- Locate the gap precisely. The factor-level comparison turns "we're behind" into "we're behind on review signal in cited SKUs, at parity on everything else" — one workstream, one owner, one quarter.
- Re-benchmark quarterly. The norm drifts upward as the category optimizes; what beat the median last year is the median now. Quarterly re-reads keep the leadership narrative calibrated while the weekly work runs against your own trend.
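The steps above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how eCommerce Insights computes anything: the `Benchmark` record, the `MIN_SAMPLE` cutoff of 30 stores, and every number below are hypothetical. It finds the largest gap against the category median and against one watchlist rival, and flags a median as illustrative when its sample is small.

```python
from dataclasses import dataclass

MIN_SAMPLE = 30  # assumed cutoff: below this, treat the median as illustrative


@dataclass(frozen=True)
class Benchmark:
    engine: str
    factor: str
    our_score: int
    category_median: int
    rival_score: int   # one competitor from the watchlist
    sample_size: int   # number of stores behind the category median


# Hypothetical rows, one per engine and factor.
rows = [
    Benchmark("ChatGPT", "schema", 74, 65, 70, 120),
    Benchmark("ChatGPT", "review signal", 60, 61, 72, 120),
    Benchmark("Perplexity", "schema", 68, 66, 69, 18),
    Benchmark("Perplexity", "review signal", 45, 58, 52, 18),
]


def worst_vs_median(items):
    """Row where we trail the category median by the most."""
    return min(items, key=lambda r: r.our_score - r.category_median)


def worst_vs_rival(items):
    """Row where we trail the watchlist rival by the most."""
    return min(items, key=lambda r: r.our_score - r.rival_score)


def describe_median_gap(row):
    gap = row.our_score - row.category_median
    note = " (illustrative, small sample)" if row.sample_size < MIN_SAMPLE else ""
    return f"vs category median: {row.engine} / {row.factor} {gap:+d}{note}"


def describe_rival_gap(row):
    gap = row.our_score - row.rival_score
    return f"vs watchlist rival: {row.engine} / {row.factor} {gap:+d}"


print(describe_median_gap(worst_vs_median(rows)))
print(describe_rival_gap(worst_vs_rival(rows)))
```

In this made-up data the two questions give different answers: the largest gap to the median is review signal on Perplexity, while the largest gap to the rival is review signal on ChatGPT. Both point at the same factor, which is the kind of result that turns into one workstream with one owner.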
What "good" looks like
The output that proves the job worked is a sentence leadership can repeat: "We're above category norm on four of six engines; the Perplexity gap is third-party grounding and it's this quarter's content-partnerships project." Context, cause, plan — one line.
Frequently asked questions
Where do the category benchmarks come from?
What's a good score relative to the benchmark?
Benchmarks or watchlist — which do I need?
Why benchmark per engine instead of overall?
How often do benchmarks change?
A 62 means nothing. A 62 against a 58 median is a plan.
Category benchmarks per engine, beside every score. 14-day trial.