This is the working glossary the rest of the guides on this site assume. Every term here has a narrow, operational meaning — the sense in which it is used in catalog work, not the broader sense that vendor decks tend to reach for. Where the vendor usage and the operational usage diverge, both definitions are given and the divergence is named.
The terms are alphabetical. Cross-references inside definitions link to other entries in the same glossary, or to the longer guide where the term gets full treatment.
Agentic shopping
A buyer interaction in which an AI assistant completes a purchase on the buyer’s behalf — booking the trip, placing the order, restocking the pantry — rather than just recommending. Subset of AI commerce in the narrow sense. Adds a trust dimension to ranking: the assistant has to be confident enough to act, not just recommend. Different from “agentic AI” in the general sense, which can mean any autonomous AI behavior outside shopping.
AI commerce
Used two ways. The narrow sense, used throughout these guides: when an AI assistant fulfills a shopping intent on behalf of a buyer by retrieving and citing specific products from a queryable index. The broad sense, common in vendor marketing: anywhere AI shows up in the shopping funnel. The narrow sense is the operationally useful one; the broad sense is too inclusive to plan against. See What AI commerce actually is.
AI Overviews
Google’s generative summaries that appear above the classic search results for some queries. Commercial-intent queries can surface product cards inside the overview. Powered by Google’s existing crawl and Merchant Center pipeline, which means the optimization discipline carries over from classic search SEO and Shopping feed work. Distinct from full-page Gemini results.
AI readiness
The catalog-side discipline of making a product citable by AI assistants — clean schema, complete identifiers, attribute-rich descriptions, accurate availability, voice-matched content. The six base dimensions Lumio scores against are documented in The 6 dimensions of AI readiness.
Brand
In Schema.org terms, a sub-object on a Product that names the manufacturer or owning brand. Best expressed as a Brand or Organization object with name and ideally a url. Important even for private-label products — the brand value identifies the responsible party and is one of the signals AI surfaces use to disambiguate similar products. See Organization schema.
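A minimal sketch of the sub-object in JSON-LD (the product, brand name, and URL are invented for illustration):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Wool Crew Sweater",
  "brand": {
    "@type": "Brand",
    "name": "Example Knitwear Co.",
    "url": "https://example.com/brand"
  }
}
```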
Candidate set
The short list of products retrieved by an AI surface in response to a buyer query, before final ranking. Usually a few dozen products. Getting into the candidate set is a separate problem from ranking well within it. See How AI agents discover and rank products.
ChatGPT Shopping
OpenAI’s in-product shopping experience inside ChatGPT. Returns named product cards inside the conversation, with citations and merchant links. Ingests a product index built from open-web catalogs and the OpenAI merchant program.
Citability
The property of a product (or a catalog) being likely to appear as a named recommendation in an AI assistant’s response. Distinct from ranking — citability is more about whether the surface trusts the data enough to include the product at all, while ranking is the order among the products it does include. The discipline of AI readiness targets citability.
ClaudeBot
Anthropic’s web crawler used to gather training data for Claude. Documented at Anthropic’s bots page. A separate user-agent, Claude-User, fetches pages on behalf of a user inside a Claude session; Claude-SearchBot handles search-related fetches. A robots.txt policy that wants to exclude Claude from a site should disallow all three; disallowing only ClaudeBot blocks training but not in-session fetches.
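A robots.txt sketch that excludes Claude entirely, covering all three user-agents named above (narrow the Disallow paths if only part of the site should be blocked):

```text
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-User
Disallow: /

User-agent: Claude-SearchBot
Disallow: /
```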
Crawl budget
The volume of pages a search or AI crawler will fetch from a site in a given window. A large catalog (tens of thousands of products) can exhaust the crawl budget before all pages are visited, leaving some pages stale in the index. Mitigated by sitemap discipline, fast page response times, and reducing low-value crawlable URLs (e.g. faceted search permutations).
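One common mitigation is keeping crawlers out of faceted-search permutations via robots.txt. A sketch; the parameter names are invented, and wildcard patterns are supported by Google and Bing but are not part of the core robots.txt standard:

```text
User-agent: *
Disallow: /*?sort=
Disallow: /*?color=
Disallow: /search
```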
Embedding
A vector representation of a chunk of content — a product title, a description, a query. AI surfaces embed both catalog content and buyer queries into the same vector space; retrieval is roughly “find the catalog embeddings nearest the query embedding.” Why description specificity matters: a generic description embeds to a generic location and matches a generic set of queries, not the specific ones a buyer in a particular need-state would issue.
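A toy sketch of embedding-based retrieval. The vectors here are invented three-dimensional stand-ins; real surfaces use model-generated vectors with hundreds or thousands of dimensions, but the nearest-neighbor logic is the same:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Invented toy embeddings for two catalog descriptions.
catalog = {
    "waterproof trail running shoe": [0.9, 0.1, 0.3],
    "leather dress shoe": [0.2, 0.8, 0.1],
}

# Invented embedding for the buyer query "shoes for rainy trail runs".
query = [0.85, 0.15, 0.25]

# Retrieval: find the catalog embedding nearest the query embedding.
best = max(catalog, key=lambda title: cosine(catalog[title], query))
print(best)
```

The point for catalog work: a generic description ("comfortable shoes") embeds near neither need-state, so specificity in the description directly shapes which queries the product retrieves for.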
Feed (product feed)
A structured data file (CSV, XML, or API) describing the catalog’s products, delivered to a feed surface like Google Merchant Center or Microsoft Merchant Center. The feed is a parallel input alongside on-page structured markup; surfaces that ingest both typically merge them, with the feed treated as authoritative on price and availability and the page treated as authoritative on content and identifiers.
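A minimal CSV sketch using a handful of standard Merchant Center attribute names (id, title, price, availability, link); the values are invented:

```text
id,title,price,availability,link
SKU-001,Wool Crew Sweater - Navy,89.00 USD,in_stock,https://example.com/products/swtr-001
SKU-002,Leather Belt - Brown,35.00 USD,out_of_stock,https://example.com/products/belt-002
```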
Generative engine optimization (GEO)
A label some vendors use for the work of optimizing for AI surfaces. Functionally overlaps heavily with AI readiness; the two terms are largely interchangeable in operator usage. GEO emphasizes the generative side; AI readiness emphasizes the catalog-side preparation.
GMC (Google Merchant Center)
The feed surface where merchants upload product data for Google Shopping, AI Overviews, and Gemini’s commercial queries. Feed integrity (low error rate, complete attributes, recent updates) is a major input to ranking on these surfaces. See Google Merchant Center setup.
GPTBot
OpenAI’s web crawler. Documented at OpenAI’s bots page. Fetches public web content for use in OpenAI’s training and retrieval surfaces, including ChatGPT Search and ChatGPT Shopping. A robots.txt disallow on GPTBot removes the catalog from OpenAI’s discovery path.
GTIN
Global Trade Item Number — the universal identifier (UPC, EAN, ISBN, JAN, etc.) that identifies a product across merchants. Encoded in Schema.org as gtin13, gtin12, gtin8, gtin14, or the generic gtin property. AI surfaces use GTIN to deduplicate the same product across resellers and to identify the canonical source. See GTINs, MPNs, and brand identifiers.
Hallucination
In AI commerce, the failure mode where an assistant returns a product that does not exist (or returns false properties of a real product — wrong price, wrong availability, wrong material). The narrow definition of AI commerce excludes pure hallucination — a surface that hallucinates products instead of retrieving them is generative, not commerce.
JSON-LD
JSON for Linking Data — the serialization format Google, Microsoft, OpenAI, Perplexity, and others prefer for Schema.org markup. Lives in a <script type="application/ld+json"> block inside the page HTML. Distinct from microdata and RDFa, which embed structured data into HTML elements directly. See JSON-LD vs. microdata vs. RDFa.
Knowledge graph
An entity-and-relationship store that an AI surface or search engine uses to disambiguate products, brands, and concepts. Google’s Knowledge Graph is the canonical example. GTINs, brand @id values, and sameAs links help an AI surface connect a product page to its knowledge-graph entity.
LLM (large language model)
The underlying model class powering AI commerce surfaces — ChatGPT (GPT-4 and successors), Claude, Gemini, and Perplexity’s mix of models. The LLM is the front-end agent that interprets the buyer query and assembles the response; the back-end retrieval system is what actually pulls the candidate products from the index.
Microsoft Merchant Center
Microsoft’s feed surface, the analog of GMC for Bing Shopping and Microsoft Copilot Shopping. Similar feed format to GMC; the main practical difference is the destination index.
MPN
Manufacturer Part Number — the manufacturer’s internal identifier for a part or product. Encoded in Schema.org as mpn. Combined with brand, an MPN can serve as an identifier when a GTIN is unavailable. Useful for B2B catalogs, electronics, and configurable products. See GTINs, MPNs, and brand identifiers.
Offers
In Schema.org terms, the Offer object attached to a Product that carries price, availability, currency, and related commercial properties. AI surfaces lean on Offer for filtering (price range queries, availability queries) and for flagging stale data. See Offer schema, pricing, availability, inventory.
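A sketch of the Offer shape surfaces filter on, in JSON-LD (all values invented):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Wool Crew Sweater",
  "offers": {
    "@type": "Offer",
    "price": "89.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "priceValidUntil": "2026-01-31",
    "url": "https://example.com/products/swtr-001"
  }
}
```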
PerplexityBot
Perplexity’s web crawler. Documented at Perplexity’s bots guide. A robots.txt disallow on PerplexityBot removes the catalog from Perplexity’s discovery path.
Perplexity Shopping
Perplexity’s shopping experience, surfaced inside their general assistant. Returns product cards inline with answers. Treats merchant pages with structured markup as the canonical product reference.
Product feed
See feed.
Product schema
The Schema.org Product type and its associated properties — name, description, image, offers, brand, gtin13, mpn, material, etc. The structured-data layer AI surfaces read to extract product information from a page. Distinct from microdata or RDFa — see JSON-LD. Full property treatment for the most common platform is in Product schema for Shopify.
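A sketch of a reasonably complete Product block in JSON-LD. The identifiers and values are invented; the GTIN is a checksum-valid test number, not a real product:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "14k Gold Band Ring",
  "description": "Solid 14k yellow gold band, 3mm width, polished finish.",
  "image": "https://example.com/images/ring-14k.jpg",
  "brand": { "@type": "Brand", "name": "Example Jewelry Co." },
  "gtin13": "0012345678905",
  "mpn": "RING-14K-01",
  "material": "14k gold",
  "offers": {
    "@type": "Offer",
    "price": "249.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://example.com/products/ring-14k"
  }
}
```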
Rich results
Google’s term for search results that include extra visual elements (star ratings, prices, availability badges) derived from structured markup. A page that passes the Rich Results Test is eligible for these features. Eligibility is necessary but not sufficient for AI Overviews placement.
Schema.org
The shared vocabulary for structured data on the web. Maintained by a working group that includes Google, Microsoft, Yahoo, and Yandex. The Product, Offer, Review, AggregateRating, Organization, BreadcrumbList, and FAQPage types are the most commonly used for AI commerce work.
Schema validity
The property of a page’s structured-data markup parsing cleanly against the Schema.org specification — required properties present, types matching, enums in their allowed value set. Tools: the Schema Markup Validator (Schema.org’s own validator) and Google’s Rich Results Test. Validity is a floor; completeness is the ceiling. See Validating structured data.
Schema completeness
The fraction of recommended properties present and populated for a given Schema.org type. A Product can validate with only name and offers.price; completeness asks how many of the other useful properties (gtin13, brand, material, color, size, image, description) are also present. AI surfaces lean on completeness heavily.
Structured data
Catch-all term for machine-parseable data on a page — typically JSON-LD with Schema.org vocabulary. The “structured” part contrasts with prose: a description in prose says “a 14k gold ring”; structured data says {"material": "14k gold", "@type": "Product"}. Both can convey the same fact; the structured form is what AI surfaces filter on.
Variants
Products that share a parent (a sweater that comes in multiple sizes and colors) but differ in specific attributes. Schema.org models these with ProductGroup, hasVariant, and isVariantOf. Variant handling is one of the more error-prone parts of product schema — done wrong, variants either get collapsed into one product or fragmented into separate products the surface can’t relate. See Variant handling in product schema.
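A sketch of the ProductGroup shape with one variant shown (IDs and values invented):

```json
{
  "@context": "https://schema.org",
  "@type": "ProductGroup",
  "name": "Wool Crew Sweater",
  "productGroupID": "SWTR-001",
  "variesBy": ["https://schema.org/size", "https://schema.org/color"],
  "hasVariant": [
    {
      "@type": "Product",
      "sku": "SWTR-001-S-NVY",
      "size": "S",
      "color": "Navy",
      "offers": {
        "@type": "Offer",
        "price": "89.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock"
      }
    }
  ]
}
```

Each variant keeps its own sku and offers; the shared attributes live on the group, which is what lets a surface relate the variants without collapsing them into one product.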
Voice rules
Lumio’s mechanism for capturing brand voice guidelines that the scoring and enrichment systems use to evaluate product content. When voice rules are populated, the AI Readiness Score adds a seventh dimension (Brand alignment) measuring how well the content matches them. See The 6 dimensions of AI readiness.
Where to read more
- What AI commerce actually is — the definitional foundation.
- How AI agents discover and rank products — the mechanics of the four-stage pipeline.
- The 6 dimensions of AI readiness — Lumio’s framework for scoring citability.