Schema Markup for AI: JSON-LD Strategies That Get You Cited by LLMs

Learn how to use schema markup and JSON-LD to improve your chances of being cited by ChatGPT, Perplexity, Claude, and Google AI Overviews.


Schema markup has always served one purpose: helping machines understand the meaning of your content. For most of SEO's history, that meant helping Google generate rich results — star ratings, FAQ dropdowns, product prices in search snippets. In 2026, the machine that needs to understand your content is often not Google's indexer but an LLM generating an answer for a user. The principles are the same; the stakes are higher.

This guide covers which schema types matter for AI citations, how to implement them correctly, and the strategic decisions around schema that most sites get wrong.

Why Schema Matters for LLM Citations

LLMs that retrieve content at inference time (Perplexity, ChatGPT with browsing, Microsoft Copilot) parse web pages to extract relevant information. Structured data is a machine-readable signal that removes ambiguity: instead of the crawler having to infer that this block of text is a question and that block is the answer, your schema tells it directly.

For LLMs trained on web crawls, structured data embedded in training pages can contribute to how the model represents concepts. A page marked up as an Article written by a named Person with a published date carries more explicit semantic context than an anonymous block of text.

There is also a more direct mechanism: Google AI Overviews draw heavily on structured data to generate answers. The FAQPage schema type, in particular, has a demonstrable correlation with appearing in AI Overviews for question-based queries.

The Formats: JSON-LD vs Microdata vs RDFa

Use JSON-LD. Full stop.

JSON-LD (JavaScript Object Notation for Linked Data) is Google's recommended format. It lives in a <script type="application/ld+json"> tag in your HTML head or body, separate from your visual markup, so it does not interfere with how the page renders. It is the easiest to implement, easiest to maintain, and most reliably parsed.

Microdata and RDFa require you to annotate your HTML elements directly with attributes. Search engines still parse both, but the approach is fragile and hard to maintain, because any template change can silently break the markup.

The Schema Types That Matter Most in 2026

Organization

Every site should have an Organization schema on its homepage. This establishes your brand as a known entity and provides the structured information that feeds Google Knowledge Panels and LLM entity representations.

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Surfaceable",
  "url": "https://surfaceable.io",
  "logo": "https://surfaceable.io/logo.png",
  "description": "AI visibility and SEO tracking platform for brands.",
  "sameAs": [
    "https://twitter.com/surfaceableio",
    "https://linkedin.com/company/surfaceable",
    "https://en.wikipedia.org/wiki/Surfaceable"
  ]
}

The sameAs property is particularly important — it links your entity to its representations on other platforms, helping LLMs build a complete picture of your brand.

Article and BlogPosting

All editorial content should have Article or BlogPosting schema. Include the author as a Person entity with a URL pointing to their author profile or personal site; this establishes authorship as a real entity and supports E-E-A-T signals. Where a post is published under a team byline, as in the example below, an Organization author is the fallback.

{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Schema Markup for AI: JSON-LD Strategies That Get You Cited by LLMs",
  "datePublished": "2025-11-11",
  "dateModified": "2025-11-11",
  "author": {
    "@type": "Organization",
    "name": "Surfaceable Team",
    "url": "https://surfaceable.io"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Surfaceable",
    "logo": {
      "@type": "ImageObject",
      "url": "https://surfaceable.io/logo.png"
    }
  },
  "description": "Learn how schema markup improves AI citation rates.",
  "mainEntityOfPage": "https://surfaceable.io/blog/schema-markup-for-ai-json-ld-strategies"
}
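
Where a post does have a named author, the author block might look like the sketch below. The name, profile URL, and LinkedIn link are hypothetical placeholders, not real people or pages; only the properties (name, url, sameAs) are standard schema.org Person properties. It would take the place of the Organization author in the example above.

```json
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Schema Markup for AI: JSON-LD Strategies That Get You Cited by LLMs",
  "author": {
    "@type": "Person",
    "name": "Jane Example",
    "url": "https://surfaceable.io/authors/jane-example",
    "sameAs": ["https://www.linkedin.com/in/jane-example"]
  }
}
```

The url should resolve to a real author page on your site, which is also where standalone Person markup for that author belongs.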

FAQPage

FAQPage schema is one of the highest-impact markups for AI visibility. It directly maps to how LLMs think about content (question → answer) and is heavily used by Google for AI Overviews.

Apply it to any page that contains question-and-answer content: FAQ pages, product pages with a Q&A section, support articles, and how-to guides that naturally follow a question structure.

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is schema markup?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Schema markup is structured data added to HTML that helps search engines and AI systems understand the meaning and context of your content."
      }
    },
    {
      "@type": "Question",
      "name": "Does schema markup help with AI citations?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. Structured data helps LLMs that retrieve web content at query time to understand and extract your content more reliably, increasing the likelihood of citation."
      }
    }
  ]
}

HowTo

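For procedural content (guides with numbered steps), use HowTo schema. LLMs are frequently asked step-by-step questions, and HowTo markup states the order and boundaries of each step explicitly instead of leaving them to be inferred from headings. Google retired HowTo rich results in 2023, so the benefit here is machine readability, not a richer search snippet.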

{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to implement schema markup",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Choose your schema type",
      "text": "Identify the most appropriate schema type for your content from schema.org."
    },
    {
      "@type": "HowToStep",
      "name": "Write your JSON-LD",
      "text": "Create a JSON-LD script block containing your structured data."
    },
    {
      "@type": "HowToStep",
      "name": "Add it to your page",
      "text": "Place the script tag in the head or body of your HTML."
    },
    {
      "@type": "HowToStep",
      "name": "Validate",
      "text": "Use Google's Rich Results Test to confirm your markup is valid."
    }
  ]
}

Product and Review

E-commerce sites and SaaS tools should implement Product schema with AggregateRating. LLMs asked to compare products or tools often retrieve product pages, and structured ratings data contextualises your position in the market.
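
A minimal sketch of Product markup with an Offer and an AggregateRating follows. Every value here (product name, brand, URLs, price, rating figures) is an invented placeholder on example.com; the rating and review count in particular must match the reviews actually shown on your page.

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "description": "A placeholder product used to illustrate the markup.",
  "image": "https://example.com/images/widget.png",
  "brand": {
    "@type": "Brand",
    "name": "Example Brand"
  },
  "offers": {
    "@type": "Offer",
    "price": "29.00",
    "priceCurrency": "GBP",
    "availability": "https://schema.org/InStock",
    "url": "https://example.com/products/widget"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "58"
  }
}
```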

BreadcrumbList

Breadcrumb schema communicates site structure to crawlers and helps AI systems understand the context of a page within your site's hierarchy. It is a low-effort, high-reward implementation.
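
As an illustration, the breadcrumb trail for this article could be marked up roughly as follows. The /blog path is taken from the article URL used earlier; the "Home" and "Blog" labels are assumptions about how the site names those levels.

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://surfaceable.io/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Blog",
      "item": "https://surfaceable.io/blog"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Schema Markup for AI"
    }
  ]
}
```

Positions start at 1 and follow the visible trail from the top of the hierarchy down. The final item is the current page, so its item URL can be left out.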

SoftwareApplication

For SaaS products, SoftwareApplication schema provides relevant context — application category, pricing, operating systems supported, and aggregate ratings.

{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Surfaceable",
  "applicationCategory": "BusinessApplication",
  "operatingSystem": "Web",
  "offers": {
    "@type": "Offer",
    "price": "49",
    "priceCurrency": "GBP"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "127"
  }
}

Common Mistakes to Avoid

Marking Up Content That Is Not Visible on the Page

Google and most AI parsers require that your structured data accurately represents the visible content. Do not mark up content that does not appear in the page's body — this can result in manual actions from Google and confuses AI parsers that cross-reference markup with visible text.

Using Multiple Schema Types That Conflict

A page can carry several schema types, in separate <script> tags or in a single block, but they must not contradict each other. On a product page, for example, the Product schema must describe the same product, price, and rating that the visitor sees, and any Organization or BreadcrumbList markup on that page should agree with it.
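
One way to keep several types consistent is to declare them together in a single JSON-LD @graph and link them by @id, so each entity is defined once and referenced elsewhere. The sketch below reuses values from the earlier examples; the #organization and #software identifiers are assumed naming choices, not something the site is known to use.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://surfaceable.io/#organization",
      "name": "Surfaceable",
      "url": "https://surfaceable.io"
    },
    {
      "@type": "SoftwareApplication",
      "@id": "https://surfaceable.io/#software",
      "name": "Surfaceable",
      "applicationCategory": "BusinessApplication",
      "operatingSystem": "Web",
      "publisher": { "@id": "https://surfaceable.io/#organization" },
      "offers": {
        "@type": "Offer",
        "price": "49",
        "priceCurrency": "GBP"
      }
    }
  ]
}
```

Because the publisher points at the Organization by @id, the brand name and URL live in one place and cannot drift apart between blocks.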

Omitting Required Properties

Each schema type has required and recommended properties. Omitting required properties prevents rich results from triggering and reduces the reliability of the structured data signal. Use Google's Rich Results Test and the Schema.org documentation to verify completeness.

Not Keeping Schema Updated

Dates, prices, and availability data go stale. Structured data that is contradicted by the visible page content is worse than no structured data.

A Practical Implementation Order

For most sites, implement schema in this order:

  1. Organization on homepage
  2. WebSite on homepage (declares your site name; the sitelinks search box it used to enable was retired by Google in 2024)
  3. BreadcrumbList across all pages
  4. Article/BlogPosting on all content pages
  5. FAQPage on any page with Q&A content
  6. Product or SoftwareApplication on product pages
  7. HowTo on procedural guides
  8. Person on author bio pages
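
As a sketch of item 2, a WebSite block for the homepage can be very small. The values reuse the Organization example from earlier in this guide; the nested publisher is an optional assumption about how you might tie the site to the brand.

```json
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Surfaceable",
  "url": "https://surfaceable.io",
  "publisher": {
    "@type": "Organization",
    "name": "Surfaceable",
    "url": "https://surfaceable.io"
  }
}
```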

Measuring the Impact

After implementing schema, monitor:

  • Google Search Console → Enhancements → check for validation errors and rich result impressions
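  • Google Rich Results Test → validate any page immediately after implementation; for types Google does not show as rich results, such as HowTo, use the Schema Markup Validator at validator.schema.org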
  • AI visibility tracking → use a tool like Surfaceable to track whether your pages are appearing more frequently in AI-generated answers over time

Schema is a compounding investment. Each correctly implemented schema type adds machine-readable context to your content, making it progressively easier for both search engines and AI systems to understand and cite your site.

Conclusion

Schema markup is one of the most concrete, controllable actions you can take to improve your visibility in AI-generated answers. It is not a magic bullet, but it is a reliable signal — one that AI retrieval systems are specifically designed to use. Implement the core schema types correctly, keep them accurate, and treat structured data as a first-class citizen in your content production workflow. The brands doing this consistently are building a technical advantage that compounds over time.

