AI commerce foundations

AI commerce glossary

A working vocabulary for AI commerce — agentic search, structured data, GTIN, GMC, JSON-LD, and the other terms that come up repeatedly in operator conversations and trade press.

10 min read · Updated May 10, 2026

This is the working glossary the rest of the guides on this site assume. Every term here has a narrow, operational meaning — the sense in which it is used in catalog work, not the broader sense that vendor decks tend to reach for. Where the vendor usage and the operational usage diverge, both definitions are given and the divergence is named.

The terms cluster into four working groups:

Surfaces: ChatGPT Shopping, Perplexity, AI Overviews, Copilot

Mechanics: Crawl, Embedding, Candidate set, Citability, Hallucination

Catalog data: Schema.org, JSON-LD, Product schema, Variants, Knowledge graph

Identifiers: GTIN, MPN, Brand, sameAs

The terms are alphabetical. Cross-references inside definitions link to other entries in the same glossary, or to the longer guide where the term gets full treatment.

Agentic shopping

A buyer interaction in which an AI assistant completes a purchase on the buyer’s behalf — booking the trip, placing the order, restocking the pantry — rather than just recommending. Subset of AI commerce in the narrow sense. Adds a trust dimension to ranking: the assistant has to be confident enough to act, not just recommend. Different from “agentic AI” in the general sense, which can mean any autonomous AI behavior outside shopping.

AI commerce

Used two ways. The narrow sense, used throughout these guides: when an AI assistant fulfills a shopping intent on behalf of a buyer by retrieving and citing specific products from a queryable index. The broad sense, common in vendor marketing: anywhere AI shows up in the shopping funnel. The narrow sense is the operationally useful one; the broad sense is too inclusive to plan against. See What AI commerce actually is.

AI Overviews

Google’s generative summaries that appear above the classic search results for some queries. Commercial-intent queries can surface product cards inside the overview. Powered by Google’s existing crawl and Merchant Center pipeline, which means the optimization discipline carries over from classic search SEO and Shopping feed work. Distinct from full-page Gemini results.

AI readiness

The catalog-side discipline of making a product citable by AI assistants — clean schema, complete identifiers, attribute-rich descriptions, accurate availability, voice-matched content. The six base dimensions Lumio scores against are documented in The 6 dimensions of AI readiness.

Brand

In Schema.org terms, a sub-object on a Product that names the manufacturer or owning brand. Best expressed as a Brand or Organization object with name and ideally a url. Important even for private-label products — the brand value identifies the responsible party and is one of the signals AI surfaces use to disambiguate similar products. See Organization schema.

Candidate set

The short list of products retrieved by an AI surface in response to a buyer query, before final ranking. Usually a few dozen products. Getting into the candidate set is a separate problem from ranking well within it. See How AI agents discover and rank products.

ChatGPT Shopping

OpenAI’s in-product shopping experience inside ChatGPT. Returns named product cards inside the conversation, with citations and merchant links. Ingests a product index built from open-web catalogs and the OpenAI merchant program.

Citability

The property of a product (or a catalog) being likely to appear as a named recommendation in an AI assistant’s response. Distinct from ranking — citability is more about whether the surface trusts the data enough to include the product at all, while ranking is the order among the products it does include. The discipline of AI readiness targets citability.

ClaudeBot

Anthropic’s web crawler used to gather training data for Claude. Documented at Anthropic’s bots page. A separate user-agent, Claude-User, fetches pages on behalf of a user inside a Claude session; Claude-SearchBot handles search-related fetches. A robots.txt policy that wants to exclude Claude from a site should disallow all three; disallowing only ClaudeBot blocks training but not in-session fetches.
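A minimal sketch of the difference between the two policies, using Python's standard urllib.robotparser. Both robots.txt bodies and the shop.example URL are hypothetical, and the result shows how Python's parser reads the rules, not a guarantee of how Anthropic's crawlers behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy that names all three Anthropic user-agents.
ROBOTS_ALL_THREE = """\
User-agent: ClaudeBot
User-agent: Claude-User
User-agent: Claude-SearchBot
Disallow: /
"""

# Hypothetical policy that names only the training crawler.
ROBOTS_TRAINING_ONLY = """\
User-agent: ClaudeBot
Disallow: /
"""


def allowed(robots_txt: str, agent: str,
            url: str = "https://shop.example/products/ring") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for agent in ("ClaudeBot", "Claude-User", "Claude-SearchBot"):
    print(agent,
          "all-three:", allowed(ROBOTS_ALL_THREE, agent),
          "training-only:", allowed(ROBOTS_TRAINING_ONLY, agent))
```

Under the second policy only ClaudeBot is refused; the in-session and search user-agents are still permitted.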

Crawl budget

The volume of pages a search or AI crawler will fetch from a site in a given window. A large catalog (tens of thousands of products) can exhaust the crawl budget before all pages are visited, leaving some pages stale in the index. Mitigated by sitemap discipline, fast page response times, and reducing low-value crawlable URLs (e.g. faceted search permutations).

Embedding

A vector representation of a chunk of content, such as a product title, a description, or a query. AI surfaces embed both catalog content and buyer queries into the same vector space; retrieval is roughly “find the catalog embeddings nearest the query embedding.” This is why description specificity matters: a generic description embeds to a generic location and matches a generic set of queries, not the specific ones a buyer in a particular need-state would issue.
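A toy illustration of nearest-embedding retrieval. The three-dimensional vectors below are hand-written stand-ins, not output from any real embedding model, and cosine similarity is assumed as the distance measure; production systems use model-generated vectors with hundreds or thousands of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical catalog embeddings (invented values for illustration).
CATALOG = {
    "generic ring": [0.6, 0.6, 0.5],
    "14k gold signet ring": [0.9, 0.1, 0.2],
    "silver stacking ring": [0.1, 0.9, 0.2],
}


def nearest(query_vec, catalog, k=2):
    """Return the names of the k catalog entries closest to the query."""
    ranked = sorted(catalog.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Stand-in vector for a specific query such as "gold signet ring".
query = [0.95, 0.05, 0.1]
print(nearest(query, CATALOG, k=2))
```

The specific description sits closest to the specific query; the generic one lands in the middle of the space and is only a moderate match for everything.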

Feed (product feed)

A structured data file (CSV, XML, or API) describing the catalog’s products, delivered to a feed surface like Google Merchant Center or Microsoft Merchant Center. The feed is a parallel input alongside on-page structured markup; surfaces that ingest both typically merge them, with the feed treated as authoritative on price and availability and the page treated as authoritative on content and identifiers.
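The merge rule described above can be sketched as follows. This is an illustration of the rule of thumb only: the field names and the merge_records helper are hypothetical, and the actual merge logic inside any given surface is not public.

```python
# Fields on which the feed is treated as authoritative (per the rule of thumb above).
FEED_AUTHORITATIVE = {"price", "availability"}


def merge_records(page: dict, feed: dict) -> dict:
    """Merge page markup and feed data for one product.

    The page wins on content and identifiers; the feed wins on price and
    availability, and also fills in fields the page does not carry.
    """
    merged = dict(page)
    for key, value in feed.items():
        if key in FEED_AUTHORITATIVE or key not in merged:
            merged[key] = value
    return merged


page = {"name": "14k gold signet ring", "gtin13": "4006381333931",
        "price": "199.00", "availability": "InStock"}
feed = {"name": "RING SIGNET GOLD", "price": "179.00",
        "availability": "OutOfStock", "shipping": "free"}

print(merge_records(page, feed))
```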

Generative engine optimization (GEO)

A label some vendors use for the work of optimizing for AI surfaces. Functionally overlaps heavily with AI readiness; the two terms are largely interchangeable in operator usage. GEO emphasizes the generative side; AI readiness emphasizes the catalog-side preparation.

GMC (Google Merchant Center)

The feed surface where merchants upload product data for Google Shopping, Google AI Overviews, and Gemini’s commercial queries. Feed integrity (low error rate, complete attributes, recent updates) is a major input to ranking on these surfaces. See Google Merchant Center setup.

GPTBot

OpenAI’s web crawler. Documented at OpenAI’s bots page. Fetches public web content for use in OpenAI’s training and retrieval surfaces, including ChatGPT Search and ChatGPT Shopping. A robots.txt disallow on GPTBot removes the catalog from OpenAI’s discovery path.

GTIN

Global Trade Item Number — the universal identifier (UPC, EAN, ISBN, JAN, etc.) that identifies a product across merchants. Encoded in Schema.org as gtin13, gtin12, gtin8, gtin14, or the generic gtin property. AI surfaces use GTIN to deduplicate the same product across resellers and to identify the canonical source. See GTINs, MPNs, and brand identifiers.
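A short sketch for sanity-checking GTINs before they go into markup. It assumes the standard GS1 mod-10 check digit (weights alternating 3 and 1, starting from the digit next to the check digit), which is not spelled out in this glossary; the function names are illustrative.

```python
def gtin_is_valid(gtin: str) -> bool:
    """Check length and the GS1 mod-10 check digit of a GTIN string."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    body, check = digits[:-1], digits[-1]
    # Walk the body right to left; the first digit gets weight 3, then 1, 3, ...
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def schema_property(gtin: str) -> str:
    """Map a GTIN to the length-specific Schema.org property name."""
    return f"gtin{len(gtin)}"


for code in ("4006381333931", "036000291452", "4006381333932"):
    print(code, gtin_is_valid(code), schema_property(code))
```

A passing check digit only shows the number is well-formed; it does not confirm the GTIN is registered to the product.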

Hallucination

In AI commerce, the failure mode where an assistant returns a product that does not exist (or returns false properties of a real product — wrong price, wrong availability, wrong material). The narrow definition of AI commerce excludes pure hallucination — a surface that hallucinates products instead of retrieving them is generative, not commerce.

JSON-LD

JSON for Linked Data, the serialization format Google, Microsoft, OpenAI, Perplexity, and others prefer for Schema.org markup. Lives in a <script type="application/ld+json"> block inside the page HTML. Distinct from microdata and RDFa, which embed structured data into HTML elements directly. See JSON-LD vs. microdata vs. RDFa.
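To make the placement concrete, here is a minimal sketch that pulls JSON-LD blocks out of a page using only Python's standard library. The PAGE string and the JsonLdExtractor class are invented for illustration; a real crawler would add error handling for malformed JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Hypothetical product page.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Signet ring", "material": "14k gold"}
</script>
</head><body><h1>Signet ring</h1></body></html>"""

print(extract_jsonld(PAGE))
```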

Knowledge graph

An entity-and-relationship store that an AI surface or search engine uses to disambiguate products, brands, and concepts. Google’s Knowledge Graph is the canonical example. GTINs, brand @id values, and sameAs links help an AI surface connect a product page to its knowledge-graph entity.

LLM (large language model)

The underlying model class powering AI commerce surfaces — ChatGPT (GPT-4 and successors), Claude, Gemini, and Perplexity’s mix of models. The LLM is the front-end agent that interprets the buyer query and assembles the response; the back-end retrieval system is what actually pulls the candidate products from the index.

Microsoft Merchant Center

Microsoft’s feed surface, the analog of GMC for Bing Shopping and Microsoft Copilot Shopping. Similar feed format to GMC; the main practical difference is the destination index.

MPN

Manufacturer Part Number — the manufacturer’s internal identifier for a part or product. Encoded in Schema.org as mpn. Combined with brand, an MPN can serve as an identifier when a GTIN is unavailable. Useful for B2B catalogs, electronics, and configurable products. See GTINs, MPNs, and brand identifiers.
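The fallback order can be sketched as a small helper. The product_identifier function and the "brand:mpn" key format are hypothetical conventions for this example, not something any surface is known to use.

```python
GTIN_PROPERTIES = ("gtin", "gtin14", "gtin13", "gtin12", "gtin8")


def product_identifier(product: dict):
    """Return ("gtin", value) when a GTIN is present, else a brand+MPN key, else None."""
    for prop in GTIN_PROPERTIES:
        if product.get(prop):
            return ("gtin", product[prop])
    brand = product.get("brand")
    # Brand may be a Schema.org object ({"@type": "Brand", "name": ...}) or a plain string.
    brand_name = brand.get("name") if isinstance(brand, dict) else brand
    if brand_name and product.get("mpn"):
        return ("brand+mpn", f"{brand_name.strip().lower()}:{product['mpn'].strip()}")
    return None


print(product_identifier({"gtin13": "4006381333931", "mpn": "AB-100"}))
print(product_identifier({"brand": {"@type": "Brand", "name": "Acme"}, "mpn": "AB-100"}))
print(product_identifier({"mpn": "AB-100"}))
```

An MPN without a brand is not treated as an identifier here, since part numbers are only unique within one manufacturer.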

Offers

In Schema.org terms, the Offer object attached to a Product that carries price, availability, currency, and related commercial properties. AI surfaces lean on Offer for filtering (price range queries, availability queries) and for flagging stale data. See Offer schema, pricing, availability, inventory.
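A minimal sketch of how an Offer supports filtering, assuming a single Offer object per product and the Schema.org InStock availability URL. The matches helper and the sample products are illustrative only.

```python
IN_STOCK = "https://schema.org/InStock"


def matches(product: dict, max_price: float, currency: str = "USD") -> bool:
    """True if the product's Offer is in stock, in the given currency, and within budget."""
    offer = product.get("offers", {})
    try:
        price = float(offer.get("price"))
    except (TypeError, ValueError):
        # A missing or unparseable price means the product cannot be filtered in.
        return False
    return (offer.get("priceCurrency") == currency
            and offer.get("availability") == IN_STOCK
            and price <= max_price)


ring = {"@type": "Product", "name": "Signet ring",
        "offers": {"@type": "Offer", "price": "179.00",
                   "priceCurrency": "USD", "availability": IN_STOCK}}
no_price = {"@type": "Product", "name": "Band ring",
            "offers": {"@type": "Offer", "priceCurrency": "USD",
                       "availability": IN_STOCK}}

print(matches(ring, max_price=200), matches(no_price, max_price=200))
```

The second product is excluded from a price-range query even though it is in stock, which is the practical cost of an incomplete Offer.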

PerplexityBot

Perplexity’s web crawler. Documented at Perplexity’s bots guide. A robots.txt disallow on PerplexityBot removes the catalog from Perplexity’s discovery path.

Perplexity Shopping

Perplexity’s shopping experience, surfaced inside their general assistant. Returns product cards inline with answers. Treats merchant pages with structured markup as the canonical product reference.

Product feed

See Feed (product feed).

Product schema

The Schema.org Product type and its associated properties — name, description, image, offers, brand, gtin13, mpn, material, etc. The structured-data layer AI surfaces read to extract product information from a page. Distinct from microdata or RDFa — see JSON-LD. Full property treatment for the most common platform is in Product schema for Shopify.

Rich results

Google’s term for search results that include extra visual elements (star ratings, prices, availability badges) derived from structured markup. A page that passes the Rich Results Test is eligible for these features. Eligibility is necessary but not sufficient for AI Overviews placement.

Schema.org

The shared vocabulary for structured data on the web. Maintained by a working group that includes Google, Microsoft, Yahoo, and Yandex. The Product, Offer, Review, AggregateRating, Organization, BreadcrumbList, and FAQPage types are the most commonly used for AI commerce work.

Schema validity

The property of a page’s structured-data markup parsing cleanly against the Schema.org specification: required properties present, types matching, enums in their allowed value set. Tools: the Schema Markup Validator (validator.schema.org, the Schema.org validator) and Google’s Rich Results Test. Validity is a floor; completeness is the ceiling. See Validating structured data.

Schema completeness

The fraction of recommended properties present and populated for a given Schema.org type. A Product can validate with only name and offers.price; completeness asks how many of the other useful properties (gtin13, brand, material, color, size, image, description) are also present. AI surfaces lean on completeness heavily.
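Completeness as a fraction can be sketched in a few lines. The property list is taken from the examples in the definition above; it is an assumption for illustration, not Lumio's scoring formula or any surface's official list.

```python
# Recommended properties, taken from the examples in the definition above.
RECOMMENDED = ("gtin13", "brand", "material", "color", "size", "image", "description")


def completeness(product: dict, recommended=RECOMMENDED) -> float:
    """Fraction of recommended properties that are present and non-empty."""
    present = [p for p in recommended
               if product.get(p) not in (None, "", [], {})]
    return len(present) / len(recommended)


# Validates with name and offers.price, but carries only two recommended properties.
product = {
    "@type": "Product",
    "name": "Signet ring",
    "offers": {"@type": "Offer", "price": "179.00"},
    "brand": {"@type": "Brand", "name": "Acme"},
    "image": "https://shop.example/img/ring.jpg",
    "description": "",
}

print(round(completeness(product), 2))
```

An empty string counts as missing, so a description property that is present but blank does not raise the score.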

Structured data

Catch-all term for machine-parseable data on a page — typically JSON-LD with Schema.org vocabulary. The “structured” part contrasts with prose: a description in prose says “a 14k gold ring”; structured data says {"material": "14k gold", "@type": "Product"}. Both can convey the same fact; the structured form is what AI surfaces filter on.

Variants

Products that share a parent (a sweater that comes in multiple sizes and colors) but differ in specific attributes. Schema.org models these with ProductGroup, hasVariant, and isVariantOf. Variant handling is one of the more error-prone parts of product schema — done wrong, variants either get collapsed into one product or fragmented into separate products the surface can’t relate. See Variant handling in product schema.
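One way the parent and child relationship might be expressed, sketched as a Python helper that emits JSON-LD. The build_product_group function, SKUs, and names are hypothetical; productGroupID and variesBy are Schema.org properties not mentioned above, included here as an assumption about what a fuller markup would carry.

```python
import json


def build_product_group(group_id: str, name: str, variants: list) -> dict:
    """Build a ProductGroup whose hasVariant list holds one Product per variant."""
    return {
        "@context": "https://schema.org",
        "@type": "ProductGroup",
        "productGroupID": group_id,
        "name": name,
        # Declares which attributes distinguish the variants from each other.
        "variesBy": ["https://schema.org/size", "https://schema.org/color"],
        "hasVariant": [
            {
                "@type": "Product",
                "sku": v["sku"],
                "name": f'{name}, {v["color"]}, {v["size"]}',
                "size": v["size"],
                "color": v["color"],
            }
            for v in variants
        ],
    }


group = build_product_group(
    "SWEATER-01",
    "Merino sweater",
    [
        {"sku": "SWEATER-01-NAVY-M", "color": "Navy", "size": "M"},
        {"sku": "SWEATER-01-GREY-L", "color": "Grey", "size": "L"},
    ],
)

print(json.dumps(group, indent=2))
```

Keeping every variant under one ProductGroup with a shared productGroupID is what lets a consumer relate the variants without collapsing them into a single product.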

Voice rules

Lumio’s mechanism for capturing brand voice guidelines that the scoring and enrichment systems use to evaluate product content. When voice rules are populated, the AI Readiness Score adds a seventh dimension (Brand alignment) measuring how well the content matches them. See The 6 dimensions of AI readiness.

Where to read more