Structured Data

Definition

Structured data is information organized in a standardized, machine-readable format that enables search engines, AI agents, and other automated systems to understand the content of a web page without ambiguity. In ecommerce, structured data describes products - their names, prices, availability, reviews, and attributes - in a way that machines can parse reliably.

Without structured data, a machine reading a product page sees raw HTML - a mix of navigation, marketing copy, images, and product details all blended together. With structured data, the same machine gets a clean, labeled set of facts: this is the product name, this is the price, this is the availability status, these are the reviews.

The most common implementation of structured data in ecommerce uses Schema.org vocabulary encoded as JSON-LD. But structured data is a broader concept that covers any format designed for machine consumption: XML product feeds, CSV exports, API responses, and protocol-based data access through the Model Context Protocol (MCP) or the Agentic Commerce Protocol (ACP).

Why It Matters

Structured data has always mattered for SEO. It is now becoming critical for AI-driven commerce.

Rich search results. Google uses structured data to generate rich results (often called rich snippets) - enhanced search listings that can show star ratings, prices, and availability. Listings with rich results tend to earn higher click-through rates than plain text listings, which on its own makes structured data implementation worthwhile.

AI agent access. When AI shopping agents process product pages, structured data is their primary source of reliable information. Because JSON-LD is plain JSON with a defined vocabulary, an agent can read it with a standard parser instead of guessing. Extracting the same information from unstructured HTML requires heuristics that often fail - especially for prices, variants, and availability.
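
To make the contrast concrete, here is a minimal sketch of how a machine might pull product facts out of a page using only the Python standard library. The class name, the function name, and the sample HTML are illustrative assumptions, not part of any particular agent's implementation.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than guess at their content


def extract_products(html):
    """Return every top-level JSON-LD object whose @type is exactly "Product"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [item for item in parser.items
            if isinstance(item, dict) and item.get("@type") == "Product"]


# Hypothetical page: marketing copy and navigation mixed with one JSON-LD block.
SAMPLE_HTML = """
<html><body>
<nav>Home / Mugs</nav>
<h1>Blue Ceramic Mug - Now 20% off!</h1>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Blue Ceramic Mug",
 "offers": {"@type": "Offer", "price": "12.50", "priceCurrency": "EUR",
            "availability": "https://schema.org/InStock"}}
</script>
</body></html>
"""

products = extract_products(SAMPLE_HTML)
for product in products:
    offer = product["offers"]
    print(product["name"], offer["price"], offer["priceCurrency"])
# prints: Blue Ceramic Mug 12.50 EUR
```

The sketch deliberately ignores cases a production parser would need to handle, such as an @graph wrapper or an @type given as a list. The point is only that the labeled facts come out without any guessing about which text on the page is the price.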

Protocol foundations. The agentic commerce protocols (MCP, ACP, UCP) all deliver structured data. When an AI agent queries a store through ACP, it receives structured product data. Merchants who already maintain clean structured data on their web pages have an easier time ensuring their protocol feeds are equally complete.

Competitive differentiation. Most stores have some structured data, but few have complete structured data. The average ecommerce product page marks up name, price, and maybe availability. Merchants who also mark up brand, SKU, dimensions, materials, color, reviews, FAQ content, and breadcrumb navigation create a much richer machine-readable profile, giving search engines and AI systems more facts to match against a shopper's request.

How It Works

Structured data in ecommerce operates at several levels:

On-page markup is the most common form. JSON-LD blocks embedded in product page HTML describe the product using Schema.org vocabulary. The markup sits in a <script type="application/ld+json"> tag, invisible to human visitors but readable by any machine that processes the page. Most ecommerce platforms generate basic product markup automatically, but the default output is often incomplete.
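
As an illustration, the following sketch renders a JSON-LD block from a product record. The record's field names (name, sku, brand, price, currency, in_stock) and the function name are hypothetical; only the Schema.org types and properties are taken from the vocabulary itself.

```python
import json


def product_jsonld_script(product):
    """Render a Schema.org Product as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "sku": product["sku"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "offers": {
            "@type": "Offer",
            "price": f'{product["price"]:.2f}',
            "priceCurrency": product["currency"],
            "availability": (
                "https://schema.org/InStock"
                if product["in_stock"]
                else "https://schema.org/OutOfStock"
            ),
        },
    }
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    # "\/" is a valid JSON escape, so the payload still parses unchanged.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical product record, e.g. one row from a store database.
example = {
    "name": "Blue Ceramic Mug",
    "sku": "MUG-001",
    "brand": "Acme",
    "price": 19.9,
    "currency": "EUR",
    "in_stock": True,
}

script_tag = product_jsonld_script(example)
print(script_tag)
```

Generating the block from the same record that drives the visible page is one way to keep the two from drifting apart, for example when a price changes.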

Product feeds are structured data files (typically XML or CSV) submitted to external platforms like Google Merchant Center and Meta Commerce Manager. They carry the same information as on-page markup, but in a bulk format designed for platform ingestion.
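
A feed can be produced from the same product records. The sketch below writes a small CSV feed with the standard library. The column names and value formats are assumptions loosely modeled on common feed conventions; each platform publishes its own required columns, so check the target specification before relying on them.

```python
import csv
import io

# Assumed column set for illustration; real platforms define their own.
FEED_COLUMNS = ["id", "title", "price", "availability", "brand", "link"]


def build_feed(products):
    """Serialize product records into CSV text, one row per product."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FEED_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for p in products:
        writer.writerow({
            "id": p["sku"],
            "title": p["name"],
            "price": f'{p["price"]:.2f} {p["currency"]}',
            "availability": "in_stock" if p["in_stock"] else "out_of_stock",
            "brand": p["brand"],
            "link": p["url"],
        })
    return out.getvalue()


# Hypothetical catalog with a single product.
catalog = [
    {
        "sku": "MUG-001",
        "name": "Blue Ceramic Mug",
        "price": 12.5,
        "currency": "EUR",
        "in_stock": True,
        "brand": "Acme",
        "url": "https://example.com/p/mug-001",
    },
]

feed_text = build_feed(catalog)
print(feed_text)
```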

API and protocol endpoints deliver structured data on demand. When an AI agent queries a store through ACP or MCP, it receives structured product data in response to specific requests - the most AI-native form of structured data.

Key product attributes to structure:

  • Product name, description, and brand
  • Price and currency (including sale prices)
  • Availability status (in stock, out of stock, pre-order)
  • Product images (multiple angles preferred)
  • SKU and product identifiers (GTIN, MPN)
  • Variants (size, color, material) as separate offers
  • Customer reviews and aggregate ratings
  • Category and breadcrumb navigation
  • Shipping and return information
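
The "variants as separate offers" item above can be sketched as follows: each size and color combination becomes its own Offer with its own SKU, price, and availability. The input field names and the status keys are hypothetical; the availability URLs are Schema.org ItemAvailability values.

```python
import json

# Maps assumed internal status codes to Schema.org availability values.
AVAILABILITY = {
    "in_stock": "https://schema.org/InStock",
    "out_of_stock": "https://schema.org/OutOfStock",
    "pre_order": "https://schema.org/PreOrder",
}


def product_with_variant_offers(name, currency, variants):
    """Build Product markup with one Offer per variant."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": [
            {
                "@type": "Offer",
                "sku": v["sku"],
                "name": f'{name} ({v["size"]}, {v["color"]})',
                "price": f'{v["price"]:.2f}',
                "priceCurrency": currency,
                "availability": AVAILABILITY[v["status"]],
            }
            for v in variants
        ],
    }


# Hypothetical variants of one product.
markup = product_with_variant_offers(
    "Trail Jacket",
    "USD",
    [
        {"sku": "TJ-M-GRN", "size": "M", "color": "Green",
         "price": 89.0, "status": "in_stock"},
        {"sku": "TJ-L-GRN", "size": "L", "color": "Green",
         "price": 94.0, "status": "pre_order"},
    ],
)
print(json.dumps(markup, indent=2))
```

With this shape, a machine can tell that size L costs more and is only available for pre-order, which a single product-level price and availability would hide.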

The gap between what merchants know about their products and what they expose as structured data is typically large. A merchant might track 50 attributes per product internally but expose only 5 in their structured data. Every unexposed attribute is a missed signal for search engines and AI agents. The practical starting point is to audit this gap with Google’s Rich Results Test or the Schema Markup Validator at validator.schema.org.
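
A rough way to quantify the gap is to compare the attribute names tracked internally with those that appear in the markup. In the example from the paragraph above, 5 exposed out of 50 tracked is 10% coverage. The sketch below does the same calculation on a smaller, hypothetical attribute list; the function name and attribute names are illustrative only.

```python
def exposure_gap(internal_attributes, exposed_attributes):
    """Return (missing attribute names, share of internal attributes exposed)."""
    internal = set(internal_attributes)
    exposed = set(exposed_attributes) & internal  # ignore names not tracked internally
    missing = sorted(internal - exposed)
    coverage = len(exposed) / len(internal) if internal else 1.0
    return missing, coverage


# Hypothetical attribute inventory for one product.
internal = ["name", "price", "availability", "brand", "sku",
            "gtin", "color", "material", "dimensions", "return_policy"]
exposed = ["name", "price", "availability"]

missing, coverage = exposure_gap(internal, exposed)
print(f"Coverage: {coverage:.0%}")
print("Not exposed:", ", ".join(missing))
# prints:
# Coverage: 30%
# Not exposed: brand, color, dimensions, gtin, material, return_policy, sku
```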

Related Concepts

  • JSON-LD - The most common format for implementing structured data on web pages
  • Schema.org - The vocabulary standard that defines structured data types and properties
  • Product Feed - Bulk structured data exports for external platforms
  • AI Readiness - The broader assessment where structured data is a core component
  • Generative Engine Optimization (GEO) - Optimization practices that rely heavily on structured data
