AI commerce foundations

The 6 dimensions of AI readiness

Lumio scores every product on six dimensions of AI discoverability — and a seventh when brand voice rules are set. Here is what each dimension measures, how scoring works, and how to read a scorecard.

10 min read · Updated May 1, 2026

The AI Readiness Score is a per-product number from 0–100 that measures how well a product’s data is structured for AI shopping agent discovery. It’s an AI-judged evaluation — Lumio reads each product’s titles, descriptions, structured markup, and identifiers through Anthropic’s Message Batches API, scores six dimensions independently, and rolls them up into the overall number.

Six dimensions cover the base case. A seventh — Brand alignment — joins when the workspace has voice rules or a brand profile populated; when it does, the other six rebalance to make room.

The base weighting:

- Title quality — 20%
- Description density — 20%
- Conversational fields — 20%
- Identifier coverage — 15%
- Schema completeness — 15%
- Availability precision — 10%

This guide is the operator’s read on what each dimension measures, how the scorecard reflects catalog state, and how to use the score to decide what to fix first.

A real catalog’s scorecard is the actionable artifact. The number at the top is the lagging signal; the dimensional breakdown is what gets on the Monday morning task list:

A sample readiness scorecard (Acme Outfitters, apparel):

- Identifier coverage (15%): 78
- Title quality (20%): 65
- Description density (20%): 52
- Conversational fields (20%): 19
- Availability precision (10%): 88
- Schema completeness (15%): 70

Overall: 58/100, score range Fair (40–69). Fix first: conversational fields (19), the biggest dimension at the lowest score.

How scoring works

Scoring runs as a background job through Anthropic’s Message Batches API for cost-efficient bulk processing. Products batch at 500 per request. The scoring model reads the product’s raw data — titles, descriptions, JSON-LD, meta tags — and evaluates it against each dimension’s criteria.
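The batching step is simple to sketch. The 500-per-request figure comes from the text above; the rest of this snippet is illustrative, not Lumio's internal code:

```python
def chunk_products(products, batch_size=500):
    """Split a catalog into fixed-size chunks; per the text, each
    chunk of 500 products becomes one Message Batches API request."""
    return [products[i:i + batch_size] for i in range(0, len(products), batch_size)]

# A 1,200-product catalog becomes three requests: 500 + 500 + 200.
catalog = [{"id": n} for n in range(1200)]
batches = chunk_products(catalog)
print([len(b) for b in batches])  # → [500, 500, 200]
```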

The brand profile (vertical, brand adjectives, customer persona) is included as context. A hiking boot is evaluated differently than a lipstick; the attributes that matter are vertical-specific.

Each dimension scores 0–100 independently. The overall score is the weighted average across the active dimensions.
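The roll-up is a plain weighted average. A sketch using the base weights and the sample scorecard's dimension scores, which reproduces its overall 58:

```python
BASE_WEIGHTS = {
    "title_quality": 0.20,
    "description_density": 0.20,
    "conversational_fields": 0.20,
    "identifier_coverage": 0.15,
    "schema_completeness": 0.15,
    "availability_precision": 0.10,
}

def overall_score(dimension_scores, weights=BASE_WEIGHTS):
    # Weighted average across the active dimensions; weights sum to 1.0.
    return round(sum(weights[d] * s for d, s in dimension_scores.items()))

# Dimension scores from the sample Acme Outfitters scorecard.
sample = {
    "title_quality": 65,
    "description_density": 52,
    "conversational_fields": 19,
    "identifier_coverage": 78,
    "schema_completeness": 70,
    "availability_precision": 88,
}
print(overall_score(sample))  # → 58
```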

The six base dimensions

1. Identifier coverage — 15%

What it measures: GTIN, MPN, brand, and model number presence and quality. The identifiers that help AI agents match products across sources and treat the product as authoritative rather than speculative.

Why it matters: AI agents serving branded queries weight identifiers heavily. A product without a GTIN competes with its own resellers and loses; a product with one is the canonical version the agent cites.

Low-score signals: missing GTIN/UPC, generic SKU as the only identifier, missing brand on private-label products, no model number on configurable items.

2. Title quality — 20%

What it measures: structured title format. Brand, product type, defining attribute, variant — not marketing slogans, not keyword-stuffed strings, not bare model names.

Why it matters: titles are the highest-weight text field across AI agents. A title that follows the structural pattern surfaces in shopping-intent queries; a title written as marketing copy surfaces in marketing-intent queries instead, because agents retrieve whatever wins the embedding match.

Low-score signals: marketing-led titles (“The Best Sweater You’ll Ever Own”), keyword-stuffed titles, bare model names, all-caps, promotional decoration.
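The structural pattern can be sketched as a template. The separator and exact rendering here are illustrative choices, not a documented format:

```python
def structured_title(brand, product_type, attribute, variant):
    # Brand, product type, defining attribute, variant: the four
    # components the pattern calls for, in one rendering of it.
    return f"{brand} {product_type} - {attribute} - {variant}"

print(structured_title("Acme Outfitters", "Crewneck Sweater", "Merino Wool", "Charcoal / M"))
# → Acme Outfitters Crewneck Sweater - Merino Wool - Charcoal / M
```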

3. Description density — 20%

What it measures: attribute-rich content that answers the implicit questions buyers ask — materials, dimensions, use cases, compatibility, certifications. Not how long the description is; how much actionable structured information it carries.

Why it matters: descriptions are the AI agent’s primary source for constraint-intent queries (“wool sweater under $200 for cold weather”). Generic descriptions don’t match these queries; specific descriptions do.

Low-score signals: marketing-only prose, missing material/dimension/use-case data, descriptions that read identically across multiple products.

4. Conversational fields — 20%

What it measures: Q&A pairs, usage scenarios, and compatibility notes that match how shoppers query AI assistants: the question-and-answer content shoppers look for before buying.

Why it matters: AI agents handling pre-purchase questions (“does this run true to size”, “is this compatible with X”) cite conversational content directly when it’s structured. Catalogs without it lose those queries entirely.

Low-score signals: no FAQ content, generic shipping/returns boilerplate (vs. genuine product-specific Q&A), no use-case walkthroughs.

5. Availability precision — 10%

What it measures: exact quantity, handling time, and replenishment date. Beyond binary in-stock / out-of-stock, the precision that lets AI agents match products to delivery-intent queries.

Why it matters: “in stock and ships today” surfaces differently from “in stock” alone. Pre-orders and back-orders that signal ship dates surface differently from those that don’t.

Low-score signals: binary availability with no quantity, no handling time, no replenishment data on out-of-stock products.
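The precision gap shows up in data shape alone. A sketch with hypothetical field names (quantity, handling_time_days, restock_date are illustrative, not Lumio's schema):

```python
# Binary availability: the floor this dimension scores beyond.
binary = {"availability": "in_stock"}

# Precise availability: the signals that match delivery-intent queries.
precise = {
    "availability": "in_stock",
    "quantity": 37,               # exact on-hand count
    "handling_time_days": 1,      # enables "ships today/tomorrow" matches
    "restock_date": None,         # set on out-of-stock or back-ordered items
}
print(sorted(set(precise) - set(binary)))
```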

6. Schema completeness — 15%

What it measures: Product, Offer, and Review JSON-LD markup quality. Required and recommended properties present and parseable.

Why it matters: this is the structured-data layer AI agents read through Google’s index, ChatGPT’s product index, Bing Shopping, and other index-driven surfaces. Schema isn’t read by AI agents on direct page fetch — it’s read by the indexes those agents query.

Low-score signals: missing core properties (offers, brand, identifiers), malformed availability strings, duplicate Product blocks, stale priceValidUntil dates.

The implementation reference for schema completeness is Product schema for Shopify.
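As a reference point, here is a minimal Product JSON-LD block that would avoid the low-score signals above, built as a Python dict with illustrative values (the product, identifiers, and dates are invented for the example):

```python
import json

product_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Acme Merino Crewneck Sweater - Charcoal, M",
    "brand": {"@type": "Brand", "name": "Acme Outfitters"},
    "gtin13": "0123456789012",          # identifier coverage feeds this dimension too
    "mpn": "ACM-SW-CHAR-M",
    "offers": {
        "@type": "Offer",
        "price": "148.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",  # full URL, not a bare string
        "priceValidUntil": "2026-12-31",               # keep this date from going stale
    },
}
print(json.dumps(product_jsonld, indent=2))
```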

The conditional seventh dimension: Brand alignment — 14%

When the workspace has voice rules or a brand profile populated, Brand alignment joins the scoring mix. It measures how well the product data matches the workspace’s voice rules and brand profile — vertical, brand adjectives, customer persona.

When Brand alignment activates, the other six dimensions rebalance to make room. The score still totals 100; the relative weights shift.
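The exact rebalanced weights aren't stated here; assuming a proportional rescale of the six base weights, the arithmetic looks like this:

```python
BASE_WEIGHTS = {
    "title_quality": 20, "description_density": 20, "conversational_fields": 20,
    "identifier_coverage": 15, "schema_completeness": 15, "availability_precision": 10,
}

def with_brand_alignment(base=BASE_WEIGHTS, brand_weight=14):
    # Assumed rule: shrink the six base weights uniformly so the
    # total stays at 100 once Brand alignment's 14 points join.
    scale = (100 - brand_weight) / 100
    weights = {d: w * scale for d, w in base.items()}
    weights["brand_alignment"] = brand_weight
    return weights

w = with_brand_alignment()
print(round(sum(w.values())))            # total still 100
print(round(w["title_quality"], 1))      # 20 rescales to 17.2 under this assumption
```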

Why it matters when active: a catalog with strong attributes but voice that doesn’t match the brand reads as inconsistent to AI agents — the surfacing weakens because the agent can’t confidently attribute the product to the brand identity it associates with the domain.

Score ranges

Three bands with operational meaning. The sample catalog above, at 58, sits in the middle band, Fair (40–69).

Schema Health vs. AI Readiness Score

Two reads on the same catalog, measuring different things.

Both are useful — Schema Health catches mechanical issues (missing GTIN, malformed offers, duplicate Product blocks) instantly. The AI Readiness Score asks the harder semantic question.

A common pattern: a catalog that passes Schema Health (markup is valid) still scores poorly on AI Readiness (markup is valid but semantically thin). Schema validation is the floor; semantic quality is what surfaces.

How dimensions interact

The dimensions are independent at the scoring layer (each is evaluated separately) but related operationally. Three patterns worth noting:

Conversational fields and Description density share content. Both dimensions read prose content on the product, just with different evaluators. A catalog that adds genuine FAQ pairs lifts both dimensions simultaneously.

Schema completeness multiplies what other dimensions can do. Title quality, description density, and identifiers all live inside the schema layer when properly structured. A high-quality title not exposed in JSON-LD reaches AI agents through fewer paths than the same title rendered into structured markup.

Brand alignment caps the other dimensions when it’s low. When voice rules are set and the catalog scores poorly on Brand alignment, the other dimensions matter less — the catalog reads as inauthentic to the brand identity, and AI agents discount accordingly.

What to do with a score

The number on its own is the wrong artifact to act on. The actionable artifact is the dimensional breakdown plus the gap report Lumio generates for any dimension scoring below 70.

A workflow:

  1. Identify the lowest-scoring dimension. The marginal readiness gain per hour of work is highest at the lowest score.
  2. Read the gap report for that dimension. It names specific issues (“No GTIN identifier found”, “Title is generic and lacks key attributes”) and actionable suggestions.
  3. Run enrichment to fix the named gaps automatically, or fix manually if the catalog has data Lumio can’t infer (custom measurements, brand-specific use cases).
  4. Re-score after changes propagate. Compare the dimension scores pre- and post-fix.
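Step 1 of the workflow can be sketched as a tie-break-aware minimum: lowest score first, and among ties, the heavier dimension, since it moves the overall number more. The sample scorecard's numbers reproduce its "fix first" call:

```python
def fix_first(dimension_scores, weights):
    # Lowest score wins; ties break toward the larger weight.
    return min(dimension_scores, key=lambda d: (dimension_scores[d], -weights[d]))

weights = {"title_quality": 20, "description_density": 20, "conversational_fields": 20,
           "identifier_coverage": 15, "schema_completeness": 15, "availability_precision": 10}
sample = {"title_quality": 65, "description_density": 52, "conversational_fields": 19,
          "identifier_coverage": 78, "schema_completeness": 70, "availability_precision": 88}
print(fix_first(sample, weights))  # → conversational_fields
```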

Where it breaks