Learn how to use schema markup and JSON-LD to improve your chances of being cited by ChatGPT, Perplexity, Claude, and Google AI Overviews.
Schema markup has always served one purpose: helping machines understand the meaning of your content. For most of SEO's history, that meant helping Google generate rich results — star ratings, FAQ dropdowns, product prices in search snippets. In 2026, the machine that needs to understand your content is often not Google's indexer but an LLM generating an answer for a user. The principles are the same; the stakes are higher.
This guide covers which schema types matter for AI citations, how to implement them correctly, and the strategic decisions around schema that most sites get wrong.
LLMs that retrieve content at inference time — Perplexity, ChatGPT with browsing, Bing Copilot — parse web pages to extract relevant information. Structured data is a machine-readable signal that removes ambiguity: instead of the crawler having to infer that this block of text is a question and that block is the answer, your schema tells it directly.
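To make the "machine-readable signal" concrete, here is a minimal Python sketch of how a retrieval pipeline might pull JSON-LD out of a page. It uses only the standard library; the class and function names are illustrative assumptions, not the parser of any particular AI product.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD script block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed markup is skipped rather than guessed at


def extract_json_ld(html):
    """Return a list of decoded JSON-LD objects found in the HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.items
```

Anything that fails to parse is silently dropped in this sketch, which is a reminder that invalid JSON-LD may simply be invisible to a consumer.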
For LLMs trained on web crawls, structured data baked into training pages contributes to how the model represents concepts. A page marked up as an Article written by a named Person with a published date carries more semantic weight than an anonymous block of text.
There is also a more direct mechanism: Google AI Overviews draw heavily on structured data to generate answers. The FAQPage schema type, in particular, has a demonstrable correlation with appearing in AI Overviews for question-based queries.
Use JSON-LD. Full stop.
JSON-LD (JavaScript Object Notation for Linked Data) is Google's recommended format. It lives in a <script type="application/ld+json"> tag in your HTML head or body, which means it does not interfere with your visual markup. It is the easiest to implement, easiest to maintain, and most reliably parsed.
Microdata and RDFa require you to annotate your HTML elements directly with attributes, an approach that is fragile and hard to maintain. Google still parses both formats, but most current tooling and documentation default to JSON-LD.
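If your pages are generated by a template or build script, the JSON-LD tag can be produced from a plain dictionary. The Python sketch below shows one way to do it; `render_json_ld` is a hypothetical helper name, and the escaping step is a defensive assumption for values that might contain a closing script tag.

```python
import json


def render_json_ld(data):
    """Serialize a dict as a JSON-LD script tag ready to place in the page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value containing "</script>" cannot close the
    # tag early. "\/" is a valid JSON escape for "/", so the data is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Surfaceable",
    "url": "https://surfaceable.io",
}
print(render_json_ld(organization))
```

Generating the tag from the same data source that renders the visible page is one way to keep the two from drifting apart.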
Every site should have an Organization schema on its homepage. This establishes your brand as a known entity and provides the structured information that feeds Google Knowledge Panels and LLM entity representations.
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Surfaceable",
  "url": "https://surfaceable.io",
  "logo": "https://surfaceable.io/logo.png",
  "description": "AI visibility and SEO tracking platform for brands.",
  "sameAs": [
    "https://twitter.com/surfaceableio",
    "https://linkedin.com/company/surfaceable",
    "https://en.wikipedia.org/wiki/Surfaceable"
  ]
}
The sameAs property is particularly important — it links your entity to its representations on other platforms, helping LLMs build a complete picture of your brand.
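All editorial content should have Article or BlogPosting schema. Include the author as a Person entity with a URL pointing to their author profile or personal site; this establishes authorship as a real entity, supporting E-E-A-T signals. The example below carries a team byline, so its author is an Organization; for a post written by an individual, use "@type": "Person" with that person's name and profile URL.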
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Schema Markup for AI: JSON-LD Strategies That Get You Cited by LLMs",
  "datePublished": "2025-11-11",
  "dateModified": "2025-11-11",
  "author": {
    "@type": "Organization",
    "name": "Surfaceable Team",
    "url": "https://surfaceable.io"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Surfaceable",
    "logo": {
      "@type": "ImageObject",
      "url": "https://surfaceable.io/logo.png"
    }
  },
  "description": "Learn how schema markup improves AI citation rates.",
  "mainEntityOfPage": "https://surfaceable.io/blog/schema-markup-for-ai-json-ld-strategies"
}
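For posts written by a named individual, the author block would be a Person rather than an Organization. The Python sketch below builds such a BlogPosting; the function name, the author "Jane Doe", and the example.com URLs are all hypothetical placeholders.

```python
import json


def build_blog_posting(headline, url, published, modified, author_name, author_url):
    """Return BlogPosting JSON-LD credited to a named Person."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "datePublished": published,
        "dateModified": modified,
        "author": {
            "@type": "Person",
            "name": author_name,
            "url": author_url,  # author profile or personal site
        },
        "mainEntityOfPage": url,
    }


# Hypothetical author and URLs, for illustration only.
posting = build_blog_posting(
    headline="Example post title",
    url="https://example.com/blog/example-post",
    published="2025-11-11",
    modified="2025-11-11",
    author_name="Jane Doe",
    author_url="https://example.com/authors/jane-doe",
)
print(json.dumps(posting, indent=2))
```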
FAQPage schema is one of the highest-impact markups for AI visibility. It directly maps to how LLMs think about content (question → answer) and is heavily used by Google for AI Overviews.
Apply it to any page that contains question-and-answer content: FAQ pages, product pages with a Q&A section, support articles, and how-to guides that naturally follow a question structure.
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is schema markup?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Schema markup is structured data added to HTML that helps search engines and AI systems understand the meaning and context of your content."
      }
    },
    {
      "@type": "Question",
      "name": "Does schema markup help with AI citations?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. Structured data helps LLMs that retrieve web content at query time to understand and extract your content more reliably, increasing the likelihood of citation."
      }
    }
  ]
}
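When the questions and answers already live in a CMS or content file, the FAQPage object above can be generated rather than hand-written. A minimal Python sketch, assuming the content is available as (question, answer) pairs; `build_faq_page` is a hypothetical helper name.

```python
import json


def build_faq_page(pairs):
    """Build FAQPage JSON-LD from an iterable of (question, answer) strings."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = build_faq_page([
    ("What is schema markup?",
     "Schema markup is structured data added to HTML."),
])
print(json.dumps(faq, indent=2))
```

Feeding the builder from the same source as the rendered Q&A keeps the markup and the visible text identical by construction.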
For procedural content — guides with numbered steps — use HowTo schema. LLMs are frequently asked step-by-step questions, and content marked up with HowTo has a structural advantage in retrieval systems.
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to implement schema markup",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Choose your schema type",
      "text": "Identify the most appropriate schema type for your content from schema.org."
    },
    {
      "@type": "HowToStep",
      "name": "Write your JSON-LD",
      "text": "Create a JSON-LD script block containing your structured data."
    },
    {
      "@type": "HowToStep",
      "name": "Add it to your page",
      "text": "Place the script tag in the head or body of your HTML."
    },
    {
      "@type": "HowToStep",
      "name": "Validate",
      "text": "Use Google's Rich Results Test to confirm your markup is valid."
    }
  ]
}
E-commerce sites and SaaS tools should implement Product schema with AggregateRating. LLMs asked to compare products or tools often retrieve product pages, and structured ratings data contextualises your position in the market.
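As a rough illustration, the Python sketch below builds Product markup with an AggregateRating computed from raw rating scores. The helper name, the product details, and the choice of `ratingCount` (ratings without written reviews) are assumptions for the example.

```python
import json


def build_product(name, description, ratings):
    """Build Product JSON-LD with an AggregateRating derived from raw scores."""
    if not ratings:
        raise ValueError("at least one rating is needed for AggregateRating")
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": round(sum(ratings) / len(ratings), 1),
            "ratingCount": len(ratings),
        },
    }


# Hypothetical product and scores, for illustration only.
product = build_product("Example Widget", "A sample product.", [5, 4, 5])
print(json.dumps(product, indent=2))
```

Deriving the rating from the real scores, rather than typing it in, avoids markup that disagrees with the reviews shown on the page.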
Breadcrumb schema communicates site structure to crawlers and helps AI systems understand the context of a page within your site's hierarchy. It is a low-effort, high-reward implementation.
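A breadcrumb trail maps naturally onto a BreadcrumbList of numbered ListItem entries. The Python sketch below assumes the trail is available as ordered (name, URL) pairs; the helper name and the example.com URLs are hypothetical.

```python
import json


def build_breadcrumbs(trail):
    """Build BreadcrumbList JSON-LD from ordered (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical site hierarchy, for illustration only.
crumbs = build_breadcrumbs([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog"),
    ("Schema markup for AI", "https://example.com/blog/schema-markup-for-ai"),
])
print(json.dumps(crumbs, indent=2))
```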
For SaaS products, SoftwareApplication schema provides relevant context — application category, pricing, operating systems supported, and aggregate ratings.
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Surfaceable",
  "applicationCategory": "BusinessApplication",
  "operatingSystem": "Web",
  "offers": {
    "@type": "Offer",
    "price": "49",
    "priceCurrency": "GBP"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "127"
  }
}
Google and most AI parsers require that your structured data accurately represents the visible content. Do not mark up content that does not appear in the page's body — this can result in manual actions from Google and confuses AI parsers that cross-reference markup with visible text.
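One way to catch markup that has drifted from the page is a simple pre-publish check. The Python sketch below flags FAQPage entries whose question or answer text does not appear in the page's visible text. It is a crude substring heuristic under the assumption that you can obtain the visible text as a string; it is not how Google or any AI parser performs this comparison.

```python
def normalize(text):
    """Collapse whitespace and ignore case so formatting differences don't matter."""
    return " ".join(text.split()).casefold()


def unmatched_faq_entries(faq_page, visible_text):
    """Return the questions whose question or answer text is absent from the page."""
    page = normalize(visible_text)
    missing = []
    for entry in faq_page.get("mainEntity", []):
        question = entry.get("name", "")
        answer = entry.get("acceptedAnswer", {}).get("text", "")
        if normalize(question) not in page or normalize(answer) not in page:
            missing.append(question)
    return missing
```

An empty list means every marked-up question and answer was found verbatim; anything else is worth a manual look before publishing.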
A page can have multiple schema types in separate <script> tags, but they should not contradict each other. If you have a product page, the Product schema should accurately describe the same product visible on the page.
Each schema type has required and recommended properties. Omitting required properties prevents rich results from triggering and reduces the reliability of the structured data signal. Use Google's Rich Results Test and the Schema.org documentation to verify completeness.
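A completeness check can also run locally before you reach for the Rich Results Test. The Python sketch below compares a JSON-LD object against a small table of expected properties. The table is a hand-picked illustration, not Google's or Schema.org's authoritative list, and the sketch assumes `@type` is a single string.

```python
# Illustrative only: a hand-picked subset, not an authoritative requirements list.
EXPECTED_PROPERTIES = {
    "Organization": ["name", "url"],
    "BlogPosting": ["headline", "datePublished", "author"],
    "FAQPage": ["mainEntity"],
    "HowTo": ["name", "step"],
}


def missing_properties(item, expected=EXPECTED_PROPERTIES):
    """Return expected properties that are absent or empty on a JSON-LD object."""
    schema_type = item.get("@type")
    if not isinstance(schema_type, str):
        return []  # multi-typed or untyped objects are out of scope here
    wanted = expected.get(schema_type, [])
    return [prop for prop in wanted if item.get(prop) in (None, "", [], {})]
```

Replace the table with the properties documented for the rich result types you actually target.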
Dates, prices, and availability data go stale. Structured data that is contradicted by the visible page content is worse than no structured data.
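Some kinds of staleness can be detected mechanically. The Python sketch below checks a JSON-LD object for date problems: a `dateModified` earlier than `datePublished`, a `dateModified` in the future, and an expired `priceValidUntil` on a single Offer. The function name and the specific checks are assumptions chosen for illustration, and dates are assumed to start with an ISO `YYYY-MM-DD` prefix.

```python
from datetime import date


def _parse(value):
    """Parse the YYYY-MM-DD prefix of an ISO date or datetime string."""
    return date.fromisoformat(value[:10])


def date_problems(item, today):
    """Return human-readable descriptions of stale or inconsistent dates."""
    problems = []
    published = item.get("datePublished")
    modified = item.get("dateModified")
    if published and modified and _parse(modified) < _parse(published):
        problems.append("dateModified is earlier than datePublished")
    if modified and _parse(modified) > today:
        problems.append("dateModified is in the future")
    offers = item.get("offers")
    if isinstance(offers, dict):  # a list of offers is not handled in this sketch
        valid_until = offers.get("priceValidUntil")
        if valid_until and _parse(valid_until) < today:
            problems.append("priceValidUntil has passed")
    return problems
```

Running a check like this on a schedule, with `today=date.today()`, surfaces pages whose markup has aged out since publication.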
For most sites, implement schema in this order:
1. Organization on homepage
2. WebSite on homepage (enables Sitelinks Searchbox)
3. BreadcrumbList across all pages
4. Article/BlogPosting on all content pages
5. FAQPage on any page with Q&A content
6. Product or SoftwareApplication on product pages
7. HowTo on procedural guides
8. Person on author bio pages

After implementing schema, monitor:
Schema is a compounding investment. Each correctly implemented schema type adds machine-readable context to your content, making it progressively easier for both search engines and AI systems to understand and cite your site.
Schema markup is one of the most concrete, controllable actions you can take to improve your visibility in AI-generated answers. It is not a magic bullet, but it is a reliable signal — one that AI retrieval systems are specifically designed to use. Implement the core schema types correctly, keep them accurate, and treat structured data as a first-class citizen in your content production workflow. The brands doing this consistently are building a technical advantage that compounds over time.