A technical SEO audit is a systematic review of the factors that affect how search engines — and increasingly, AI crawlers — can find, understand, and rank your site. Done well, it surfaces actionable issues ranked by impact. Done poorly, it produces a list of 200 "issues" that keep your developers busy fixing things that do not actually affect rankings.
This guide covers the essential components of a technical SEO audit in 2026, in a practical order that prioritises impact.
Before You Start: Set Your Scope
A complete technical audit covers everything from crawlability to Core Web Vitals. For large sites (100k+ pages), a full audit is a months-long project. For most sites, a focused audit of the most impactful areas is more practical.
Define scope before you start:
- What is the primary goal? (Fix ranking drops? Improve crawl efficiency? Prepare for migration?)
- How large is the site? (This determines how much data you need to collect)
- What tools are available? (A Screaming Frog licence is essentially required; Google Search Console is free and essential)
Tools You Need
Essential:
- Google Search Console — indexing data, crawl errors, Core Web Vitals field data, rich results
- Screaming Frog SEO Spider (or Sitebulb) — site crawler for comprehensive technical data
- PageSpeed Insights — page experience data and Lighthouse reports
- Google Chrome DevTools — for diagnosing specific technical issues
Useful additions:
- Ahrefs or Semrush — backlink data, crawl data, ranking data
- Bing Webmaster Tools — relevant if Bing traffic matters; also important for AI search (Bing powers some AI retrieval)
- Screaming Frog Log Analyser — for Googlebot crawl log analysis on large sites
Step 1: Crawl the Site
Start with a full site crawl using Screaming Frog (or equivalent). Configure it to:
- Crawl JavaScript (enable Screaming Frog's JavaScript rendering mode)
- Include subdomains if relevant
- Set a reasonable crawl speed (respect your server's capacity)
- Follow redirects but report them
The crawl gives you a baseline dataset for the rest of the audit.
Initial checks from the crawl:
- Total number of URLs discovered (expected vs actual?)
- How many URLs return a 200 status? 4xx? 5xx?
- Are there redirect chains (301 → 301 → 200)?
- Are there redirect loops?
- How many pages have duplicate <title> tags? Duplicate meta descriptions?
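Redirect chains and loops are easy to miss in a raw crawl export. A minimal sketch of how to surface them, assuming you have exported your crawl as a mapping from each redirecting URL to its target (the exact export format depends on your crawler):

```python
def trace_redirects(start, redirects, max_hops=10):
    """Follow a URL through a redirect map; report the chain and any loop.

    `redirects` maps a URL to its redirect target (absent = final 200).
    Returns (chain, is_loop). A chain longer than two URLs means at least
    one intermediate hop that should be collapsed into a single redirect.
    """
    chain = [start]
    seen = {start}
    url = start
    while url in redirects and len(chain) <= max_hops:
        url = redirects[url]
        if url in seen:  # revisiting a URL means a redirect loop
            return chain + [url], True
        chain.append(url)
        seen.add(url)
    return chain, False

# Hypothetical crawl data: /old 301 -> /interim 301 -> /new (a chain to flatten)
redirects = {"/old": "/interim", "/interim": "/new"}
print(trace_redirects("/old", redirects))  # (['/old', '/interim', '/new'], False)
```

Run this over every URL that returned a 3xx status; any result with more than two URLs in the chain, or `is_loop` set, goes on the fix list.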
Step 2: Indexability Review
Open Google Search Console → Index → Pages. This report shows:
- Indexed pages — pages Google has indexed
- Not indexed pages — with reasons (404, noindex, blocked by robots.txt, duplicate without canonical, etc.)
Key questions:
- How many of your important pages are indexed? If important pages are not indexed, why?
- Are any pages indexed that should not be? (Test environments, thin pages, parameterised URLs)
- Are there "Discovered — currently not indexed" pages? These are known to Google but not crawled — may indicate crawl budget issues.
Cross-reference with your sitemap: every URL in your sitemap should be indexed (unless recently removed).
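The sitemap cross-reference is a simple set comparison. A sketch, assuming you have your sitemap URLs and an indexed-pages export from Search Console as plain lists:

```python
def sitemap_index_gaps(sitemap_urls, indexed_urls):
    """Compare sitemap URLs against indexed URLs (e.g. a GSC export).

    Returns (missing, unexpected): sitemap URLs Google has not indexed,
    and indexed URLs that are absent from the sitemap.
    """
    sitemap, indexed = set(sitemap_urls), set(indexed_urls)
    return sorted(sitemap - indexed), sorted(indexed - sitemap)

# Hypothetical data: /pricing is in the sitemap but unindexed;
# /old-landing is indexed but missing from the sitemap
missing, unexpected = sitemap_index_gaps(
    ["/", "/pricing", "/blog"],
    ["/", "/blog", "/old-landing"],
)
print(missing, unexpected)  # ['/pricing'] ['/old-landing']
```

The `missing` list is your indexability worklist; the `unexpected` list often reveals parameterised or legacy URLs that should be canonicalised or removed.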
Step 3: robots.txt Audit
Review your robots.txt file manually at yourdomain.com/robots.txt.
Check for:
- Any Disallow rules that might be blocking important pages or sections
- Whether your sitemap is referenced
- AI crawler directives — are GPTBot, PerplexityBot, ClaudeBot explicitly allowed or blocked? Is this intentional?
- Test-environment URLs that might be accidentally crawlable
Verify what Google sees via the robots.txt report in Search Console (Settings → robots.txt), which shows the file versions Google has fetched and any parse errors.
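You can also test rules offline with Python's standard-library robots.txt parser. A sketch using a hypothetical robots.txt that blocks GPTBot entirely while allowing everyone else outside /staging/:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration
ROBOTS = """\
User-agent: *
Disallow: /staging/

User-agent: GPTBot
Disallow: /
""".splitlines()

def crawler_access(robots_lines, agents, path):
    """Report which user agents may fetch `path` under these rules."""
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return {agent: rp.can_fetch(agent, path) for agent in agents}

print(crawler_access(ROBOTS, ["Googlebot", "GPTBot", "ClaudeBot"], "/blog/post"))
# Googlebot and ClaudeBot fall under the * group; GPTBot is blocked entirely
```

Running this across your important paths and the AI crawler user agents makes accidental blocks visible before they cost you visibility.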
Step 4: XML Sitemap Audit
Review your sitemap structure:
- Is the sitemap submitted to Google Search Console?
- Does it include all important, indexable pages?
- Does it exclude non-canonical pages, paginated pages (beyond page 1), and noindex pages?
- Are lastmod dates accurate? (Inaccurate lastmod is worse than no lastmod — it signals unreliability)
- Is the sitemap itself accessible and returning a 200 status?
- For large sites: is it under the 50,000 URL / 50MB per file limit?
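Several of these checks can be automated against the sitemap XML itself. A minimal sketch using Python's standard library, checking the URL count against the 50,000-per-file limit and whether each lastmod value actually parses as a date:

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def audit_sitemap(xml_text, max_urls=50_000):
    """Basic sitemap checks: URL count vs the 50k limit, lastmod parseability."""
    root = ET.fromstring(xml_text)
    urls = root.findall("sm:url", NS)
    bad_lastmod = []
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if lastmod is not None:
            try:
                datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
            except ValueError:
                bad_lastmod.append(loc)
    return {"url_count": len(urls), "over_limit": len(urls) > max_urls,
            "bad_lastmod": bad_lastmod}

# Hypothetical two-URL sitemap, one with a malformed lastmod
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-01-15</lastmod></url>
  <url><loc>https://example.com/a</loc><lastmod>not-a-date</lastmod></url>
</urlset>"""
print(audit_sitemap(SAMPLE))
```

This catches the syntactic problems; whether a well-formed lastmod is *accurate* still needs a spot-check against actual page change history.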
Step 5: HTTPS and Security
- Does the root domain redirect HTTP → HTTPS?
- Are there any pages still served over HTTP?
- Is there mixed content on any HTTPS pages? (HTTP resources embedded in HTTPS pages)
- Is the SSL certificate valid and not close to expiry?
- Is HSTS enabled?
Run your root domain through SSL Labs (ssllabs.com/ssltest) for a comprehensive security assessment.
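Mixed content can be scanned for directly in page source. A sketch using the standard-library HTML parser, flagging http:// resource URLs (the attribute list is illustrative, not exhaustive):

```python
from html.parser import HTMLParser

class MixedContentScanner(HTMLParser):
    """Flag http:// resource URLs embedded in a page served over HTTPS."""
    RESOURCE_ATTRS = {"src", "href", "data", "poster"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.RESOURCE_ATTRS and value and value.startswith("http://"):
                # <a href="http://..."> is a plain link, not mixed content
                if not (tag == "a" and name == "href"):
                    self.insecure.append((tag, value))

scanner = MixedContentScanner()
scanner.feed('<img src="http://example.com/logo.png"><a href="http://example.com/">x</a>')
print(scanner.insecure)  # [('img', 'http://example.com/logo.png')]
```

Run it over the fetched HTML of your template pages; anything it flags will trigger browser mixed-content warnings or blocks.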
Step 6: Core Web Vitals Assessment
Field Data (What Actually Matters)
In Google Search Console → Experience → Core Web Vitals, review:
- LCP status: what percentage of pages are "Good" (<2.5s)?
- INP status: what percentage are "Good" (<200ms)?
- CLS status: what percentage are "Good" (<0.1)?
- Any specific URL groups flagged as "Poor"?
Field data is what Google uses for rankings. This is your primary source of truth.
Lab Data (For Diagnosis)
Run representative pages through PageSpeed Insights to get Lighthouse scores and specific recommendations. Identify the specific issues causing field data failures:
- LCP issues: image loading, server response time, render-blocking resources
- INP issues: long JavaScript tasks, heavy event handlers, third-party scripts
- CLS issues: images without dimensions, dynamically injected content, web font loading
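The Good / Needs Improvement / Poor bands follow fixed thresholds, so classifying your own field data is mechanical. A sketch (thresholds per Google's published Core Web Vitals definitions):

```python
def classify_cwv(lcp_s, inp_ms, cls):
    """Classify field-data values against Google's Core Web Vitals thresholds."""
    def rate(value, good, poor):
        return "good" if value <= good else "poor" if value > poor else "needs improvement"
    return {
        "LCP": rate(lcp_s, 2.5, 4.0),   # seconds
        "INP": rate(inp_ms, 200, 500),  # milliseconds
        "CLS": rate(cls, 0.1, 0.25),    # unitless layout shift score
    }

print(classify_cwv(2.1, 350, 0.05))
# {'LCP': 'good', 'INP': 'needs improvement', 'CLS': 'good'}
```

Note that Google rates a URL group by its 75th-percentile field value, so feed this the p75 numbers from CrUX or Search Console, not averages.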
Step 7: Structured Data / Schema Audit
Coverage: What schema types are implemented across the site? Are all content pages using Article or BlogPosting? Do product pages have Product schema? Does the homepage have Organization schema?
Validity: Run a sample of pages through Google's Rich Results Test. Are there errors or warnings?
Accuracy: Does the schema data match the visible page content? (Google requires this)
Completeness: Are required properties present for each schema type?
Check Google Search Console → Enhancements for a site-wide view of schema errors.
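For bulk coverage and completeness checks, JSON-LD blocks can be extracted and inspected programmatically. A sketch with the standard library; the required-property list here is illustrative (what counts as required varies per schema type and rich result):

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Pull JSON-LD blocks out of <script type="application/ld+json"> tags."""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks.append(json.loads(data))

def missing_props(block, required=("headline", "datePublished", "author")):
    """Return the required properties absent from a JSON-LD block."""
    return [p for p in required if p not in block]

extractor = JsonLdExtractor()
extractor.feed('<script type="application/ld+json">{"@type": "Article", "headline": "Hi"}</script>')
print(missing_props(extractor.blocks[0]))  # ['datePublished', 'author']
```

This won't replace the Rich Results Test's validation logic, but it scales the coverage and completeness questions across thousands of pages.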
Step 8: Canonical Tag Audit
Canonical tags prevent duplicate content issues. Audit for:
- Pages with no canonical tag (should be self-canonical for all indexable pages)
- Canonical tags pointing to redirected URLs (should point to the final destination)
- Canonical chains (A canonicals to B which canonicals to C — canonicalise directly to C)
- Incorrect cross-domain canonicals
- Paginated pages canonicalising to page 1 (incorrect — use proper pagination signals instead)
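Canonical chains can be detected the same way as redirect chains. A sketch, assuming a crawl export mapping each URL to the URL its canonical tag points at:

```python
def resolve_canonical(url, canonicals, max_hops=10):
    """Follow canonical tags until a self-canonical URL; flag chains.

    `canonicals` maps each URL to the URL its canonical tag points at.
    Returns (final_url, hops). hops > 1 means a canonical chain to fix.
    """
    hops, seen = 0, {url}
    while canonicals.get(url, url) != url and hops < max_hops:
        url = canonicals[url]
        if url in seen:  # canonical loop
            return url, hops + 1
        seen.add(url)
        hops += 1
    return url, hops

# Hypothetical data: A canonicals to B, B to C, C is self-canonical
canonicals = {"/a": "/b", "/b": "/c", "/c": "/c"}
print(resolve_canonical("/a", canonicals))  # ('/c', 2)
```

Every URL where `hops > 1` should be repointed directly at the final destination; a URL with `hops == 0` and no entry in the map has no canonical tag at all.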
Step 9: Internal Link Structure
Using your Screaming Frog crawl data:
- Identify orphan pages (no internal links pointing to them)
- Identify pages with very few internal links (undervalued relative to their importance?)
- Check for broken internal links (links to 404 pages)
- Review anchor text distribution — is it descriptive?
- Identify redirect links in navigation (always link to the final URL)
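Orphan detection is set arithmetic over the crawl's link graph. A sketch, assuming an edge list of (source, target) internal links exported from the crawler:

```python
def find_orphans(all_urls, links):
    """Find crawled pages that no other page links to.

    `links` is a list of (source, target) internal links from the crawl.
    The homepage is excluded, since nothing needs to link to it.
    """
    linked_to = {target for _, target in links}
    return sorted(set(all_urls) - linked_to - {"/"})

# Hypothetical crawl: /blog/lost-post exists but nothing links to it
pages = ["/", "/about", "/blog", "/blog/lost-post"]
links = [("/", "/about"), ("/", "/blog")]
print(find_orphans(pages, links))  # ['/blog/lost-post']
```

One caveat: a crawler can only reach pages via links, so combine the crawl URL list with your sitemap and server logs to catch orphans the crawler never discovered.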
Step 10: Page Speed and Server Performance
Beyond Core Web Vitals:
- Average TTFB across the site — target under 600ms
- Server response rates under load (especially relevant if you have traffic spikes)
- CDN configuration and coverage
- Image optimisation — are images appropriately sized and in modern formats (WebP, AVIF)?
- Font loading strategy — are web fonts causing render delay?
Step 11: Mobile Usability
Google indexes the mobile version of your site. Check:
- Do key pages render and function well at mobile viewport sizes? (Google retired the Search Console Mobile Usability report and the standalone Mobile-Friendly Test in late 2023; use Lighthouse's mobile audits in PageSpeed Insights instead)
- Is content identical on mobile and desktop? (Differences may cause indexing issues)
- Are tap targets appropriately sized?
- Is viewport meta tag correctly set?
Step 12: International SEO (if applicable)
For sites serving multiple languages or regions:
- Is hreflang implemented correctly? (Return links, x-default, correct language codes)
- Are language/region variations canonicalised correctly?
- Are there geotargeting settings in Google Search Console?
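The return-link requirement is the most common hreflang failure, and it's checkable from crawl data. A sketch, assuming an export mapping each URL to its declared hreflang alternates:

```python
def hreflang_missing_returns(hreflang_map):
    """Check hreflang return links: if page A lists B as an alternate,
    B must list A back. Returns the (A, B) pairs where B does not.

    `hreflang_map` maps each URL to {lang_code: alternate_url}.
    """
    missing = []
    for url, alternates in hreflang_map.items():
        for lang, alt in alternates.items():
            if url not in hreflang_map.get(alt, {}).values():
                missing.append((url, alt))
    return missing

# Hypothetical data: /de/ fails to link back to /en/
pages = {
    "/en/": {"en": "/en/", "de": "/de/"},
    "/de/": {"de": "/de/"},
}
print(hreflang_missing_returns(pages))  # [('/en/', '/de/')]
```

Google ignores hreflang annotations that lack a return link, so each flagged pair means the language targeting is silently broken for that page.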
Step 13: JavaScript SEO
For JavaScript-heavy sites:
- Is critical content (navigation, main body, headings) present in the raw HTML source (view-source:)?
- Are internal links present as standard <a href=""> elements in the HTML?
- Is schema markup rendered in the raw HTML, not injected via JavaScript after load?
Test pages with JavaScript disabled in Chrome DevTools to see the "crawler view" of your content.
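One quick programmatic check: count the crawlable links in the raw HTML and compare against the rendered DOM. A sketch with the standard-library parser; the javascript:/fragment filter is a simplification of what actually counts as crawlable:

```python
from html.parser import HTMLParser

class AnchorCounter(HTMLParser):
    """Count crawlable <a href> links in raw (unrendered) HTML."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # javascript: and fragment-only hrefs are not crawlable links
            if href and not href.startswith(("javascript:", "#")):
                self.hrefs.append(href)

counter = AnchorCounter()
counter.feed('<a href="/pricing">Pricing</a><a href="#" onclick="go()">Menu</a>')
print(counter.hrefs)  # ['/pricing']
```

A large gap between the raw-HTML link count and the rendered-DOM link count signals navigation that depends on JavaScript execution, which many AI crawlers will never see.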
Step 14: AI Crawler Accessibility
New for 2026 — explicitly review your site's AI crawler accessibility:
- Is llms.txt implemented at the root?
- Are AI crawlers (PerplexityBot, ClaudeBot, GPTBot) correctly configured in robots.txt?
- Do any key pages have JavaScript rendering that would block AI crawlers?
- Are important pages reachable within 3 clicks from the homepage?
Prioritising Your Findings
Every audit produces more issues than can be fixed at once. Prioritise by:
- Indexability issues — pages that should be indexed but are not
- Core Web Vitals failures — especially mobile field data
- Structured data errors — on high-traffic or high-priority pages
- Broken links and redirect chains — particularly in navigation
- Canonical issues — on high-value pages
- Page speed — server response time and image optimisation
Issues that affect many pages (a canonical tag error on a template) take priority over one-off issues.
Conclusion
A well-executed technical SEO audit is not a one-time event — it is the beginning of an ongoing programme. Conduct a full audit twice yearly and a lighter maintenance check monthly (new 404s, indexing changes, Core Web Vitals drift).
Track your progress by the metrics that matter: indexed page count, Core Web Vitals field data pass rate, and Search Console error counts. And increasingly, supplement your Google-focused audit with AI crawler accessibility checks — because in 2026, technical SEO serves two audiences, and both matter.