The GEO Audit: 12 Checks Before You Write Another Post
A pattern we keep running into: a content team hears that AI search is the next frontier, commissions twenty "AI-optimized" articles, publishes them — and three months later, nothing. No citations in ChatGPT, no presence in AI Overviews, no referral trickle from Perplexity. When we finally look at the site, the explanation is rarely in the articles. It is in the plumbing. The pages render through client-side JavaScript that most AI crawlers never execute. The firewall has been silently dropping GPTBot for a year. The brand's founding date is different on the site, on LinkedIn, and in the one directory the language models actually trained on. The new content was poured into a container that leaks.
That is why the right first move in generative engine optimization is not writing. It is auditing. An audit costs a day or two; a quarter of content written onto a broken foundation costs a quarter. The framework below is the one we use before any GEO content engagement — twelve checks, grouped into four areas, ordered roughly by how fatal a failure in each one is. For each check you get four things: what to verify, how to verify it with concrete methods, what a pass looks like, and the failure we find most often in the wild.
One framing note before we start. GEO is not a separate universe from SEO — the engines that generate answers still depend on crawling, indexing, and trust signals, which is why roughly half of these checks will look familiar to anyone who has run a technical SEO audit. The difference is in what each check is testing for. If the distinction between the two disciplines is still fuzzy, our companion piece on what actually changes between GEO and SEO draws the line in detail; this article assumes the line and walks the checklist.
A GEO audit is a structured review of whether AI search engines can access, extract, trust, and cite your content. It covers four areas — crawler access and rendering, passage-level content structure, entity and authority signals, and measurement — producing a prioritized fix list you complete before investing in new AI-targeted content.
Group A: Access and retrievability — can the engines reach you at all?
Everything else in this audit is irrelevant if AI systems cannot fetch and read your pages. Access failures are binary and total: a blocked crawler does not cite you less, it cites you never. These three checks come first because a failure here invalidates all downstream effort, and because they are also the failures teams are least likely to suspect — nobody assumes their own site is invisible.
1. Crawler access: robots.txt, CDN, and firewall rules
What to verify. That the user-agents of the major AI systems are permitted to fetch your content — in robots.txt, but also at the CDN and web application firewall layer, where blocking is invisible from the outside. The agents that matter as of 2026: GPTBot (OpenAI's training crawler), OAI-SearchBot (the separate crawler behind ChatGPT Search — many teams block GPTBot for training reasons without realizing this second agent is the one that affects whether ChatGPT can cite them), PerplexityBot, and ClaudeBot (Anthropic). One critical nuance that trips up even experienced teams: Google-Extended is a robots.txt token that governs whether your content trains Gemini models. Blocking it does not remove you from AI Overviews, because AI Overviews are built on Google's ordinary search index, crawled by plain Googlebot. We regularly meet teams who blocked Google-Extended believing they had opted out of AI Overviews, and teams who left it open believing that was what kept them in. Both are wrong; the lever they are reaching for does not control the thing they care about.
How to check. Read robots.txt line by line — it takes five minutes. Then check the layers robots.txt cannot show you: pull a sample of server logs or CDN logs and search for each user-agent string. If GPTBot or PerplexityBot appears with rows of 403 or 429 responses — or does not appear at all over a thirty-day window on a site of any real size — something upstream is rejecting them. Many CDN providers have shipped one-click "block AI bots" toggles, and we have seen these enabled by a well-meaning developer during a scraping scare and never revisited. Finally, fetch a key page yourself with each bot's user-agent string via curl and confirm you get a 200 with full HTML, not a challenge page.
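The log search described above can be roughed out in a few lines. This is a minimal sketch, not a definitive tool: it assumes your server or CDN exports logs in the common "combined" format (request, status, bytes, referrer, user-agent in that order), and the function names `tally_ai_crawler_statuses` and `suspicious_agents` are hypothetical. The "more than half blocked" threshold is an illustrative assumption, not a standard.

```python
import re
from collections import Counter, defaultdict

# User-agent substrings named in this check; extend to match your own policy.
AI_AGENTS = ("GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot")

# Assumption: combined log format, i.e. "REQUEST" STATUS BYTES "REFERRER" "USER-AGENT".
LINE_RE = re.compile(r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"')


def tally_ai_crawler_statuses(lines):
    """Return {agent: Counter({status: hits})} for each agent in AI_AGENTS."""
    tallies = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # line is not in the assumed format; skip rather than guess
        user_agent = match.group("ua").lower()
        for agent in AI_AGENTS:
            if agent.lower() in user_agent:
                tallies[agent][int(match.group("status"))] += 1
    return {agent: tallies.get(agent, Counter()) for agent in AI_AGENTS}


def suspicious_agents(tallies):
    """Agents that never appear, or whose responses are mostly 403 or 429."""
    flagged = []
    for agent, counts in tallies.items():
        total = sum(counts.values())
        blocked = counts[403] + counts[429]
        if total == 0 or blocked > total / 2:
            flagged.append(agent)
    return flagged
```

Fed a thirty-day log file (for example `tally_ai_crawler_statuses(open("access.log"))`, where the filename is a placeholder), an agent that lands in the flagged list is the cue to go ask whoever owns the CDN or WAF configuration.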
What pass looks like. Every AI crawler you have decided to allow returns 200s in your logs, robots.txt reflects a deliberate policy rather than an accident, and whoever owns the CDN config can tell you what the bot-management settings are.
Common failure. The silent firewall block. Robots.txt looks welcoming, the team believes they are open, and the WAF has been serving 403s to every AI user-agent since someone toggled a setting eighteen months ago. From the outside everything looks fine, which is exactly why it survives so long.
2. Rendering dependence: does your content exist without JavaScript?
What to verify. Whether the substantive content of your key pages is present in the raw HTML the server returns, or whether it only materializes after client-side JavaScript runs. Googlebot renders JavaScript, eventually and at a cost. Most AI crawlers, as far as their observable behavior shows, largely do not — they fetch the HTML response and parse what is there. If your pricing table, your definitions, your product explanations live inside a client-rendered application shell, then to GPTBot and PerplexityBot your page is a header, a footer, and an empty div.
How to check. The lowest-tech method is the most reliable: curl the URL, or use your browser's view-source (not the rendered inspector — the actual source), and search for a distinctive sentence from the page's main content. If you can find your key paragraphs in the raw response, you pass. Do this for each page template — homepage, blog post, product page, documentation — because rendering strategy often differs by template. A site can be server-rendered on its marketing pages and fully client-rendered on the docs that contain everything worth citing.
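The same curl-and-search check can be scripted when there are many templates to cover. The sketch below uses only the Python standard library; the helper names and the default user-agent string are hypothetical, and the tag stripping is deliberately crude (a regex pass, not a real HTML renderer), which is the point: nothing here executes JavaScript.

```python
import re
from html import unescape
from urllib.request import Request, urlopen


def fetch_raw_html(url, user_agent="geo-audit-sketch/0.1"):
    """Fetch the server's initial HTML response; no JavaScript is executed."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=15) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def visible_text(html):
    """Crude extraction: drop script/style bodies and tags, collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def content_in_raw_html(html, probe_sentences):
    """Map each probe sentence to True if it appears in the raw HTML's text."""
    haystack = visible_text(html).lower()
    return {s: " ".join(s.lower().split()) in haystack for s in probe_sentences}
```

A typical run would call `content_in_raw_html(fetch_raw_html(url), [...])` once per template, with one distinctive sentence each from the intro, the FAQ, and any comparison table, so that a half-rendered page does not slip through on the strength of its first paragraph.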
What pass looks like. Every template that carries citable content returns that content in the initial HTML response. Server-side rendering, static generation, or plain old HTML all pass; what matters is the payload, not the framework.
Common failure. The half-rendered page. The shell and the first paragraph are server-rendered, so spot checks pass — but the FAQ accordion, the comparison table, and the customer-facing data load via a client-side fetch. The page's most extractable assets are precisely the parts the AI crawlers never see.
3. Index hygiene: you cannot be cited if you are not indexed
What to verify. That the pages you want cited are actually in the indexes the engines draw from. AI Overviews work from Google's index; ChatGPT Search and Perplexity maintain their own crawl-and-index pipelines but inherit many of the same hygiene dependencies. Three specific things to inspect: canonical tags (is any key page canonicalized to a different URL, telling engines to ignore it?), stray noindex directives (in meta tags or HTTP headers — header-level noindex is invisible in view-source and routinely missed), and sitemap freshness (does the sitemap include your recent content, return 200, and avoid listing redirected or dead URLs?).
How to check. Start in Google Search Console's page indexing report and read the excluded categories — "Duplicate without user-selected canonical" and "Excluded by noindex" are the two that hide accidents. Pick your ten most important pages and verify each one individually: inspect the URL, confirm it is indexed, confirm the Google-selected canonical matches the one you declared. Check response headers for X-Robots-Tag on a sample of templates. Then open the sitemap itself and spot-check that the newest entries are real, live, 200-serving URLs.
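Search Console remains the source of truth for what Google actually selected, but the on-page half of this check (declared canonical, meta robots, and the header-level noindex that view-source hides) can be pre-screened in bulk. A minimal sketch, assuming you already have each page's URL, HTML, and response headers in hand; `index_findings` is a hypothetical helper, and it compares the canonical to the URL as exact strings on purpose so that trailing-slash mismatches surface.

```python
from html.parser import HTMLParser


class _DirectiveParser(HTMLParser):
    """Collects the first rel=canonical link and any robots meta directives."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.meta_robots = []

    def handle_starttag(self, tag, attrs):
        attrs = {key: (value or "") for key, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = self.canonical or attrs.get("href")
        elif tag == "meta" and attrs.get("name", "").lower() in ("robots", "googlebot"):
            self.meta_robots.append(attrs.get("content", "").lower())


def index_findings(url, html, headers):
    """Return a list of human-readable index-hygiene problems for one page."""
    parser = _DirectiveParser()
    parser.feed(html)
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value.lower()
    findings = []
    if "noindex" in header_value:
        findings.append("noindex in X-Robots-Tag header")
    if any("noindex" in content for content in parser.meta_robots):
        findings.append("noindex in meta robots tag")
    if parser.canonical and parser.canonical != url:
        findings.append(f"canonical points elsewhere: {parser.canonical}")
    return findings
```

An empty list for a page is not proof that it is indexed; it only means the page is not telling engines to ignore it.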
What pass looks like. Your money pages are indexed under their intended canonical URLs, exclusion reports contain only deliberate exclusions, and the sitemap is a current, accurate map rather than an archaeological record.
Common failure. The canonical conflict introduced by a parameter or a trailing-slash inconsistency, where the page exists in the index — just under a URL nobody links to, with the authority split across two addresses. The page ranks weakly, gets cited never, and every report says it is "indexed."
Group B: Content structure — can an engine lift an answer out of your page?
Generative engines do not cite pages; they cite passages. When a system assembles an answer, it retrieves chunks of text — a paragraph, a list, a table — scores them for relevance and answer quality, and synthesizes from the winners. This is the mechanical reality behind every piece of GEO advice about structure, and it is covered from the engine's side in our complete guide to generative engine optimization. The audit question for this group is blunt: if a machine tore any given paragraph out of your page and showed it to a stranger, would it still work?
4. Answer-first paragraphs
What to verify. That each key page answers its central question in the first 40–60 words beneath the relevant heading — before the context, the caveats, and the wind-up. Retrieval systems favor passages where the answer is densely concentrated near the query terms; readers, incidentally, favor the same thing.
How to check. Take your twenty most important pages. For each, write down the single question the page exists to answer. Then read only the first two sentences under the title and under each major H2, and ask: is the question answered yet? Be strict — "it depends, and in this article we will explore" is not an answer. We score each page answer-first / answer-buried / answer-absent and tally.
What pass looks like. The majority of key pages deliver a direct, self-sufficient answer in the opening lines, with elaboration after — the shape this very article used in its fourth paragraph.
Common failure. The thousand-word throat-clear: history of the topic, why the topic matters, what this guide will cover — with the actual answer first appearing at word 800, well past where extraction systems and humans alike have stopped reading.
5. Question-shaped headings mapped to real queries
What to verify. That your H2s and H3s correspond to questions people actually ask, phrased close to how they ask them. Headings are the strongest structural signal of what a passage is about; a heading that matches a real query gives the passage beneath it a head start in retrieval for that query.
How to check. Export your headings (a crawler tool does this in bulk, or grep your HTML for heading tags). Set them against query data: Search Console queries containing question words, People Also Ask entries for your core terms, and the question sets you have mined before — the method in our piece on question keywords applies directly here. Count how many real, recurring questions in your space have no corresponding heading anywhere on your site, and how many of your headings are clever-but-unsearchable ("Let's talk turkey") rather than query-shaped ("How long does a GEO audit take?").
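If no crawler tool is at hand, the heading export and the first-pass comparison can be sketched with the standard library. Everything here is a rough heuristic under stated assumptions: the list of question openers is our own illustrative choice, `is_question_shaped` will misjudge some headings, and `unanswered_queries` only catches queries with no exactly matching heading after normalization, so treat its output as a shortlist for human review.

```python
import re
from html.parser import HTMLParser

# Illustrative list of words that often open a question; adjust for your language.
QUESTION_OPENERS = ("how", "what", "why", "when", "which", "who", "where",
                    "can", "does", "do", "is", "are", "should")


class HeadingCollector(HTMLParser):
    """Collects (tag, text) for every h2 and h3, including text in nested tags."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "h3"):
            self._current, self._buffer = tag, []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headings.append((tag, " ".join("".join(self._buffer).split())))
            self._current = None


def is_question_shaped(heading):
    """Heuristic: ends with '?' or starts with a common question word."""
    words = heading.lower().split()
    return heading.strip().endswith("?") or (bool(words) and words[0] in QUESTION_OPENERS)


def _normalize(text):
    return " ".join(re.sub(r"[^a-z0-9 ]+", "", text.lower()).split())


def unanswered_queries(queries, headings):
    """Queries with no exactly matching heading (after normalization)."""
    seen = {_normalize(text) for _, text in headings}
    return [q for q in queries if _normalize(q) not in seen]
```

Run `HeadingCollector().feed(html)` per page, pool the `headings` lists across the site, and compare against the question queries pulled from Search Console and People Also Ask.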
What pass looks like. Your important question queries each map to a specific heading on a specific page, and a stranger reading only your heading hierarchy could reconstruct what questions the page answers.
Common failure. Headings written as chapter titles for a reader already inside the narrative — "The plot thickens," "Going deeper" — which are invisible to retrieval and unhelpful even to skimming humans.
6. Self-contained passages
What to verify. That individual paragraphs survive being lifted out of context. A retrieved chunk arrives alone: if it says "as mentioned above, this approach also fails," the engine has a sentence about nothing. Pronouns without antecedents, unnamed subjects ("the tool," "this method"), and references to earlier sections all degrade a passage's standalone value — and standalone value is what gets a passage selected and cited.
How to check. The lift test. Take ten paragraphs at random from key pages, paste each into a blank document, and read it cold. Does it name its subject? Could you tell what product, company, or concept it is about without the surrounding page? We mark each paragraph pass or fail; if fewer than roughly seven of the ten pass, the site has a systemic writing-habit issue rather than isolated slips.
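The lift test is a reading exercise and no script replaces it, but a crude pre-screen can point the reader at likely failures first. The sketch below is an assumption-heavy heuristic: the opener list and back-reference phrases are illustrative, not exhaustive, and `systemic_issue` simply encodes the roughly-seven-in-ten threshold against results you have marked by hand.

```python
import re

# Illustrative phrases that point outside the paragraph; extend as needed.
BACK_REFERENCES = re.compile(
    r"\b(as (mentioned|noted|discussed|described) (above|earlier|previously)"
    r"|see above|the (former|latter))\b",
    re.IGNORECASE,
)
# Words that, as the first word of a paragraph, usually lean on earlier context.
DANGLING_OPENERS = ("it", "this", "that", "these", "those", "they")


def lift_test_flags(paragraph):
    """Return reasons a paragraph may not survive being lifted out of context."""
    flags = []
    words = re.findall(r"[A-Za-z']+", paragraph)
    if words and words[0].lower() in DANGLING_OPENERS:
        flags.append(f"opens with '{words[0]}'")
    if BACK_REFERENCES.search(paragraph):
        flags.append("refers back to earlier text")
    return flags


def systemic_issue(manual_results, threshold=0.7):
    """manual_results: booleans from the human read (True means the paragraph passed)."""
    return bool(manual_results) and sum(manual_results) / len(manual_results) < threshold
```

A paragraph with no flags can still fail the cold read, so the flags narrow where to look rather than deliver the verdict.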
What pass looks like. Key entities are re-named at reasonable intervals (the product name, not just "it"; "a GEO audit," not just "this process"), and each paragraph carries one complete, attributable idea.
Common failure. Writing optimized for linear reading by a loyal human — graceful, flowing, pronoun-rich — that atomizes into meaningless fragments the moment a machine chunks it.
7. Extractable formats: lists, tables, definitions, steps
What to verify. That content which is logically structured is also visually and semantically structured. A comparison buried in prose is hard to extract; the same comparison as a table is trivially extractable. Definitions, numbered procedures, pros-and-cons lists, and spec tables are the formats generative engines lift most cleanly — and the formats that win conventional rich results too, which is why this check overlaps with the groundwork in our guide to structured data and rich results.
How to check. For each key page, list the inherently structured ideas it contains — anything that is a set, a sequence, a comparison, or a definition. Then check whether each one is marked up as actual list, table, or step elements in the HTML, or merely narrated in paragraphs. The ratio of structured-ideas-formatted-as-structure to structured-ideas-trapped-in-prose is the score.
What pass looks like. Every set is a list, every comparison is a table, every procedure is numbered steps, and definitions sit in single crisp sentences an engine could quote verbatim without editing.
Common failure. The wall of prose that contains a genuinely excellent five-step process — which no engine will ever extract, because extracting it would require the engine to do the structuring the author skipped.
Group C: Trust and entity signals — why would an engine cite you over anyone else?
Access gets you fetched; structure gets you extracted; this group determines whether you get chosen. Generative engines are conservative citers — they synthesize from sources they can attribute and prefer sources whose claims are corroborated elsewhere. These checks audit whether your site reads, to a machine assembling a trust picture from many scattered signals, like a source worth attaching a name to.
8. Author and E-E-A-T surface
What to verify. That your content is visibly attached to identifiable, credible humans with evidence of first-hand experience. Bylines that link to real author pages; author pages that state credentials, role, and history; content that contains the texture of actual practice — screenshots, observed numbers, "when we ran this" specifics — rather than competently rearranged common knowledge. These are the signals Google's quality systems reward, as unpacked in our analysis of what E-E-A-T actually rewards, and they feed the same trust assessment that decides citations.
How to check. Audit a sample of twenty articles: does each have a byline? Does the byline resolve to a page that says anything verifiable? Does the author exist anywhere off your site — LinkedIn, conference talks, other publications? Then read three articles asking one question: what sentence here could only have been written by someone who has actually done this? If the honest answer is "none," the experience signal is absent regardless of how the pages are decorated.
What pass looks like. Named authors with corroborating off-site presence, and content where first-hand evidence is structural, not ornamental.
Common failure. Every post bylined "Team" or "Admin," authored by a rotation of freelancers, with author pages that are a name and a stock avatar. To an engine triangulating trust, this content is anonymous — and anonymous sources lose ties.
9. Original, citable assets
What to verify. That your site contains things engines structurally prefer to attribute: original data points, benchmarks you measured, named frameworks you coined, and quotable one-sentence definitions. Synthesis engines need sources for claims; a specific number or a crisp definition demands attribution in a way that generic advice never does. "Email open rates vary" needs no citation. "In our sample of 400 audits, 31% of sites blocked at least one AI crawler unintentionally" — a claim like that, if you have genuinely measured it, must be cited to be used.
How to check. Inventory your site for original assets: anything measured, surveyed, benchmarked, or defined by you rather than summarized from elsewhere. Count them. Then take your five most strategically important claims and ask whether each is stated in a single self-contained, quotable sentence anywhere on the site, or only ever implied across three paragraphs.
What pass looks like. A handful of genuinely original, regularly updated data assets, each anchored by quotable summary sentences, each living at a stable URL an engine can attach attribution to.
Common failure. A hundred articles of synthesis containing zero claims that originate with you. Such a site can be perfectly accurate and perfectly uncitable: every fact on it is available from an older, more authoritative source, so the engine cites the older source.
10. Entity consistency and structured data parseability
What to verify. That the basic facts about your company and products are identical everywhere a machine might learn them: your site, LinkedIn, app marketplaces, business directories, and the Wikipedia-adjacent tier of sources that training corpora and retrieval systems lean on. What you do, what category you are in, when you were founded, what the product is called — stated once, the same way, everywhere. Language models build entity representations by aggregation; contradictions dilute the entity into vagueness, and vague entities do not get recommended. Alongside this, verify your structured data parses: Organization and product schema that validates, matches the visible page, and asserts the same facts as everywhere else.
How to check. Build a one-page fact sheet — name, category, description, founding facts, product names, key numbers. Then visit every profile and listing you control and diff each against the sheet. Run your key templates through a schema validator and check not just that markup validates but that its claims agree with the visible content and the fact sheet. Finally, ask the major chat engines "what is [your company]?" and note which facts come back wrong — each error usually traces to a stale source you can fix.
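Once the fact sheet exists, the diff itself is mechanical. A minimal sketch, assuming you have transcribed each profile's facts by hand into a dictionary (the surface names, field names, and `diff_against_fact_sheet` are all hypothetical); fields a surface does not state are skipped rather than counted as mismatches, and comparison ignores case and spacing only.

```python
def _normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences pass."""
    return " ".join(str(value).lower().split())


def diff_against_fact_sheet(fact_sheet, surfaces):
    """Return {surface: {field: (expected, found)}} for every contradicting field."""
    report = {}
    for surface, facts in surfaces.items():
        mismatches = {}
        for field, expected in fact_sheet.items():
            if field not in facts:
                continue  # the surface is silent on this fact; not a contradiction
            found = facts[field]
            if _normalize(found) != _normalize(expected):
                mismatches[field] = (expected, found)
        if mismatches:
            report[surface] = mismatches
    return report
```

The facts asserted in your Organization and product schema can be added as one more entry in `surfaces`, which puts the markup through the same comparison as LinkedIn and the directories.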
What pass looks like. One canonical set of facts, repeated verbatim across every controlled surface, mirrored in valid schema, and reflected accurately when engines describe you.
Common failure. Drift. The site says one positioning (rewritten last quarter), LinkedIn says the previous one, a directory from 2021 says you are in a category you exited, and the schema was written by an agency two pivots ago. No single statement is false enough to fix, so the contradiction persists indefinitely.
Group D: Measurement and monitoring — would you even know if it worked?
The first three groups are interventions; this one is instrumentation. GEO suffers from a real measurement gap (citations do not appear in rank trackers, and much AI influence never produces a click), but "hard to measure" is not "unmeasurable." A team with no measurement layer cannot tell whether any of the preceding ten fixes did anything, which makes these two checks the difference between a strategy and a guess.
11. AI referral tracking in your analytics
What to verify. That your analytics can isolate traffic arriving from AI surfaces. Clicks from ChatGPT arrive with chatgpt.com as the referrer; Perplexity sends perplexity.ai; Copilot referrals identify as coming from Microsoft's Copilot domains. None of these are broken out by default in GA4 — they sit inside generic referral buckets unless you build segments. And know what you cannot see: Google Search Console does not separate AI Overview impressions or clicks from ordinary web search; they are blended into the same rows, so GSC can confirm overall search health but cannot tell you your AI Overview performance specifically. Anyone selling you a precise "AI Overview traffic number" from GSC is reading data that does not exist. For how AI Overviews actually behave on the results page, see our complete guide to AI Overviews.
How to check. Open GA4 traffic acquisition, filter session source for chatgpt, perplexity, and copilot, and see whether anything is there. If you have never looked, the history is already collected — referrer data accrues whether or not you segment it. Build a saved exploration or a channel group that pools AI referrers, and annotate baseline volume and the landing pages receiving it. The wider GA4 setup this slots into is covered in our GA4 guide for SEOs.
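If you would rather work from an export than from the GA4 interface, the same pooling can be sketched offline. This assumes you have rows of (referrer, landing page) from any export; the substring tokens mirror the filter terms above, and the function names and labels are hypothetical. Substring matching is intentionally loose and could over-match an unrelated domain that happens to contain one of the tokens.

```python
from collections import Counter
from urllib.parse import urlparse

# Substring tokens matching the GA4 filter terms above, mapped to display labels.
AI_SOURCE_TOKENS = {"chatgpt": "ChatGPT", "perplexity": "Perplexity", "copilot": "Copilot"}


def classify_referrer(referrer):
    """Return an AI source label for a referrer URL or bare hostname, else None."""
    referrer = referrer or ""
    host = (urlparse(referrer).hostname or referrer).lower()
    for token, label in AI_SOURCE_TOKENS.items():
        if token in host:
            return label
    return None


def ai_landing_pages(sessions):
    """sessions: iterable of (referrer, landing_page). Returns Counter keyed by (label, page)."""
    counts = Counter()
    for referrer, landing_page in sessions:
        label = classify_referrer(referrer)
        if label:
            counts[(label, landing_page)] += 1
    return counts
```

The resulting counter is the baseline to annotate: which AI surfaces send visits, and which pages receive them.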
What pass looks like. A persistent, monitored AI-referral segment with a known baseline, reviewed on the same cadence as organic reporting — and a team that can answer "which pages do AI engines send people to?" without an ad-hoc investigation.
Common failure. The data exists, segmented by nobody, and the team's entire belief about whether AI search matters to their business is a mood rather than a number they could check in ninety seconds.
12. Citation spot-checks on your money queries
What to verify. Whether you — and your competitors — are actually being cited, right now, for the queries that drive your business. Referral tracking captures clicks; citation checking captures presence, including the presence that never clicks. This requires going to the engines directly, because no analytics product observes what ChatGPT says about your category.
How to check. Build a fixed prompt set: ten to twenty queries spanning your category ("best X for Y"), your problems ("how do I solve Z"), and your brand. Run the set on a recurring schedule — monthly at minimum — across ChatGPT Search, Perplexity, and Google with AI Overviews triggered. Log, for each query and engine: were you cited, who was, and what was said. Keep the prompt set stable so the log becomes a trend line rather than a pile of anecdotes; expect noise run-to-run and read direction, not single results. The tactics for converting presence into citations are their own subject — covered in our guide to getting cited by ChatGPT, Gemini, and Perplexity — but the audit question is simply whether this log exists.
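The log does not need tooling; a spreadsheet works. For teams that prefer code, here is one possible shape for it, as a sketch with hypothetical names (`CitationCheck`, `citation_share`). It assumes one record per query, engine, and run date, with the cited domains entered by hand after each run.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationCheck:
    run_date: str          # ISO date of the run, e.g. "2026-01-15"
    engine: str            # e.g. "ChatGPT Search", "Perplexity", "AI Overviews"
    query: str             # one prompt from the fixed prompt set
    cited_domains: tuple   # every domain the answer cited, yours or a competitor's


def citation_share(log, domain):
    """Return {run_date: fraction of that run's checks that cited `domain`}."""
    totals, hits = defaultdict(int), defaultdict(int)
    for check in log:
        totals[check.run_date] += 1
        if domain in check.cited_domains:
            hits[check.run_date] += 1
    return {run_date: hits[run_date] / totals[run_date] for run_date in sorted(totals)}
```

Calling `citation_share` once for your own domain and once for each competitor's gives the trend lines to read side by side. Given the run-to-run noise, compare months rather than individual rows.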
What pass looks like. A standing prompt set, a dated log, and a citation share trend you can place next to your fix list to see what moved.
Common failure. One excited founder query in January ("ChatGPT mentioned us!"), screenshot circulated, never repeated, no record of which competitors were cited alongside — enthusiasm where instrumentation should be.
Scoring the audit and ordering the fixes
Twelve checks produce a messy pile of findings; the audit only becomes useful when the pile becomes a sequence. We score each check simply — pass, partial, fail — and then sort the failures by a two-axis logic of impact against effort. Resist the urge to fix things in the order you found them, because the groups differ enormously in both dimensions.
Access failures are blockers, and blockers go first — always. A failed check 1, 2, or 3 multiplies everything else by zero. They are also, mercifully, usually cheap: a robots.txt edit, a CDN toggle, a canonical correction can each be a same-day fix. There is no scenario in which writing new content is the right move while an access failure is open. This is the easiest prioritization call in the entire framework, and it is the one most often gotten wrong, because writing content feels like progress and editing a firewall rule does not.
Structure failures are quick wins. Checks 4 through 7 fail page by page, which means they can be fixed page by page — no platform migration, no organizational change. Rewriting the opening paragraph of your twenty most important pages to answer-first form is perhaps two days of an editor's time and is, in our experience, the highest-leverage two days in GEO. Converting prose comparisons to tables, renaming headings to match real questions, breaking pronoun chains so passages stand alone: each is an afternoon-scale edit with passage-level effect. Sort these by page value — fix the pages that matter, not the pages that are easiest.
Trust and entity failures are slow compounding plays. Checks 8 through 10 cannot be fixed in a sprint. Author credibility accrues; original data assets take a cycle to produce; entity consistency involves chasing profiles you half-control and waiting for the wider web to catch up. Start them early precisely because they are slow — the entity cleanup you begin today is the citation behavior you observe two quarters from now — but never let their slowness block the quick wins, and never confuse "started" with "done." These are standing programs, not tickets.
Measurement failures are cheap and urgent in a different way. Checks 11 and 12 cost a few hours and gate your ability to learn. Fix them in week one regardless of what else is broken, because every week without a baseline is a week of effects you will never be able to attribute. If you fix access in week two and citations rise in week six, you only know that because the measurement layer existed in week one.
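The sequencing in the four paragraphs above can be written down as a small sort. This is a sketch of one reasonable reading, not a formal scoring model: the wave labels, the choice to put access and measurement in the same first wave, and the fails-before-partials tie-break are our own simplifying assumptions, and `fix_order` is a hypothetical name.

```python
GROUPS = {"A": (1, 2, 3), "B": (4, 5, 6, 7), "C": (8, 9, 10), "D": (11, 12)}
GROUP_OF_CHECK = {check: group for group, checks in GROUPS.items() for check in checks}

# Assumption: access (A) and measurement (D) share the first wave.
WAVE = {
    "A": (1, "this week"),
    "D": (1, "this week"),
    "B": (2, "this month"),
    "C": (3, "this quarter"),
}
SEVERITY = {"fail": 0, "partial": 1}


def fix_order(scores):
    """scores: {check_number: "pass" | "partial" | "fail"} -> ordered fix list."""
    open_items = [(check, status) for check, status in scores.items() if status != "pass"]
    open_items.sort(
        key=lambda item: (WAVE[GROUP_OF_CHECK[item[0]]][0], SEVERITY[item[1]], item[0])
    )
    return [(check, status, WAVE[GROUP_OF_CHECK[check]][1]) for check, status in open_items]
```

Within the structure wave, the list should still be re-sorted by page value by hand, since the script knows nothing about which pages matter.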
One optional item sits outside the twelve because it is neither load-bearing nor harmful: llms.txt, a proposed convention for offering AI systems a clean, markdown-style index of your most important content. The honest status as of mid-2026: it is a proposal, and no major engine — not Google, not OpenAI, not Anthropic — has committed to using it as a ranking or retrieval input. Adding one takes five minutes and costs nothing, so we file it as a harmless hedge. Just do not let anyone sell it to you as a fix for anything in groups A through D, because there is no evidence it substitutes for any of them.
The decision rule, before you write another post
Here is the rule we apply at the end of every audit, and the sentence this entire framework compresses into: new content is only justified when Group A passes clean, the pages that would link to and surround the new content pass Group B, and Group D exists. Access clean, structure decent, measurement on — write. Anything in Group A failing — stop, fix it this week, then write. Group B failing broadly — split your effort, because retrofitting your existing top pages will outperform adding new ones to a site whose format fights extraction. Group C failing — write anyway, but point the new content at the gap: make the next piece an original-data asset with a named author rather than another round of synthesis.
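Written as a function, the rule looks roughly like this. The four inputs are judgment calls from the audit, not computed values, and the function name and return strings are hypothetical. One branch is our own assumption: the rule requires that Group D exists but does not name an action when it is missing, so the sketch says to turn measurement on first.

```python
def content_decision(group_a_clean, group_b_broadly_failing, group_c_failing, group_d_exists):
    """Apply the end-of-audit rule; the first matching condition wins."""
    if not group_a_clean:
        return "stop: fix access this week, then write"
    if not group_d_exists:
        # Assumption: the rule requires measurement but names no action for its absence.
        return "hold: turn measurement on first"
    if group_b_broadly_failing:
        return "split effort: retrofit existing top pages alongside new content"
    if group_c_failing:
        return "write, aimed at the gap: an original-data asset with a named author"
    return "write"
```

The order of the branches carries the priorities: an access failure overrides everything, and a trust gap changes what you write rather than whether you write.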
The uncomfortable arithmetic behind the rule is that content production is the most expensive activity in this entire discipline, and it is the one teams default to because it is the one they already know how to do. An audit day that prevents a wasted content quarter is the best trade available in GEO right now. Whether the channel justifies the investment at all is a fair strategic question — we have argued both sides in our piece on optimizing for engines that send no traffic — but if you are going to do this work, doing it onto a foundation you have actually inspected is not optional diligence. It is the work.
Run the twelve checks. Score them honestly. Fix the blockers this week, schedule the quick wins this month, start the compounding plays this quarter, and turn the measurement on before any of it. Several of these checks are exactly the kind of recurring, structured verification that should not depend on someone remembering to do them — crawler access, index hygiene, referral segments, citation logs — and that is where an SEO AI agent platform like Orova earns its keep, running the repeatable parts of this audit automatically and on schedule so your team's attention goes to the judgment calls: what to claim, what to measure, and what to write once the foundation finally deserves it.