
I Spent $10K on AEO and Got Zero AI Citations. Here Is the Audit Section That Would Have Caught Why.
citability.dev now scores Wikipedia, Wikidata, and JSON-LD sameAs presence. Free, opt-in, under 10s. Part of the AVR Framework, see chudi.dev/framework.
Why this matters
I spent 106 hours and $10,000 on AEO infrastructure that produced zero AI citations. The infrastructure was correct. I had llms.txt, JSON-LD, Speakable schema, HowTo blocks, FAQs, structured headings. The site audited at 9 of 9 on AI infrastructure.
ChatGPT still wouldn’t recommend me. Perplexity wouldn’t link to me. Claude didn’t know I existed.
The obvious fix was “write more content.” I tried that. 48 blog posts later, citation count was still zero. The next obvious fix was “build more backlinks.” I tried that too. The DR climbed. The citations did not.
Content volume is downstream of entity recognition. Entity recognition is governed by graphs I do not own. The missing signal was off-site authority. Specifically: Wikipedia, Wikidata, and the entity graph that AI systems trust because it was built by humans, not by my own marketing team.
That gap is now a citability.dev audit section.
What Phase 1 measures
Phase 1 of Section 5 ships three checks. All three are VERIFIABLE (deterministic detection, not best-effort), all three sit at signal-hierarchy rank 10 (the highest weight in the AI Visibility Readiness (AVR) Framework, see chudi.dev/framework), and all three target the entity-recognition layer that the prior four AVR sections completely ignored.
- Wikipedia entry exists for the brand or operator name. Detection via Wikipedia API title search, with a disambiguation-page filter so “Apple (disambiguation)” does not count as a real entry. Pass = at least one mainspace article matches.
- Wikidata entry exists AND links to the brand domain via P856 (official website) or P973 (described at URL). Detection via Wikidata SPARQL with a wbsearchentities + wbgetentities fallback when the SPARQL endpoint is slow. Pass = at least one Q-item links to the audited domain.
- Schema sameAs completeness on the homepage. Detection via JSON-LD extraction (sketched below). Pass = a Person or Organization node lists at least 4 of: LinkedIn, GitHub, Wikipedia, Wikidata, X / Twitter, Medium.
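Here is a minimal sketch of that third check, assuming a raw-HTML input; the function names, the regex-based script extraction, and the hostname fragments are my assumptions, not the tool's actual code:

```python
import json
import re

# The six platform categories the check counts, keyed to hostname fragments.
PLATFORMS = {
    "linkedin": ("linkedin.com",),
    "github": ("github.com",),
    "wikipedia": ("wikipedia.org",),
    "wikidata": ("wikidata.org",),
    "x_twitter": ("x.com", "twitter.com"),
    "medium": ("medium.com",),
}

LDJSON = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def sameas_categories(homepage_html: str) -> set:
    """Map sameAs URLs on Person/Organization JSON-LD nodes to platform categories."""
    found = set()
    for raw in LDJSON.findall(homepage_html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD block: skip, don't crash the audit
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if not {"Person", "Organization"} & set(types):
                continue
            for url in node.get("sameAs", []):
                for category, hosts in PLATFORMS.items():
                    if any(host in url for host in hosts):
                        found.add(category)
    return found

def sameas_check(homepage_html: str) -> bool:
    # Pass = at least 4 of the 6 categories present on the page's entity nodes.
    return len(sameas_categories(homepage_html)) >= 4
```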
Section verdict bands for Phase 1:
- PASS: 3 of 3 pass
- PARTIAL: 1 to 2 pass
- FAIL: 0 pass
Recommendations are auto-generated from failed checks, ordered by signal hierarchy rank.
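The banding and ordering logic is deliberately trivial. A sketch, with my own check labels; all three ranks are 10 per the framework, so the tiebreak only matters once lower-ranked checks exist:

```python
RANKS = {"wikipedia_entry": 10, "wikidata_entry": 10, "schema_sameas": 10}

def section_verdict(results: dict) -> str:
    """results maps check name -> bool. PASS on 3/3, PARTIAL on 1-2, FAIL on 0."""
    passed = sum(1 for ok in results.values() if ok)
    if passed == 3:
        return "PASS"
    return "PARTIAL" if passed >= 1 else "FAIL"

def recommendations(results: dict) -> list:
    """One recommendation per failed check, highest signal-hierarchy rank first."""
    failed = [name for name, ok in results.items() if not ok]
    return sorted(failed, key=lambda name: -RANKS.get(name, 0))
```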
How to run it
The audit is opt-in via a new /citability section5 operation. It runs in under 10 seconds against a clean URL, makes only free API calls (Wikipedia + Wikidata + a single homepage fetch), and emits a JSON object that merges into the existing AVR audit shape under the section_5_off_site_authority key.
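Only the section_5_off_site_authority key above comes from the actual output contract; every field inside it in the sketch below is my assumption about what one result could look like:

```json
{
  "section_5_off_site_authority": {
    "verdict": "PARTIAL",
    "checks": {
      "wikipedia_entry": { "status": "PASS" },
      "wikidata_entry": { "status": "FAIL" },
      "schema_sameas": { "status": "PASS" }
    },
    "recommendations": [
      "Create a Wikidata Q-item linking to the audited domain via P856"
    ],
    "elapsed_seconds": 7.3
  }
}
```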
```
python3 section5_off_site_authority.py https://your-domain.example \
  --brand "Your Brand" \
  --owner "Your Name" \
  -o section5.json
```

Or via the existing /citability skill:

```
/citability section5 https://your-domain.example
```

What I learned running it on the first two real domains
I ran Phase 1 against two reference domains while validating the build. Both came back PARTIAL 2/3. Neither came back the way the spec predicted. Watch what happens.
| Site | Wikipedia | Wikidata | sameAs (>=4 of 6) | Verdict | Time |
|---|---|---|---|---|---|
| chudi.dev | PASS | FAIL (no Q-item yet) | PASS (4 categories) | PARTIAL 2/3 | 7.3s |
| github.com | PASS | PASS | FAIL (no JSON-LD on homepage) | PARTIAL 2/3 | 6.8s |
The github.com result was the surprise. The original spec assumed github.com would pass all three checks and be the canonical “fully passing” reference fixture. github.com does have a Wikipedia article and a Wikidata Q-item with P856 -> github.com. What it does NOT have, as of today, is any JSON-LD on its homepage. The check correctly returns FAIL for sameAs because the schema simply is not there.
That is exactly the kind of finding the audit exists to surface. A domain ranked 99 on the open web can still be invisible to AI systems that consult schema markup for entity disambiguation. DR does not predict entity readability.
The chudi.dev result tells me what I am fixing next: create a Wikidata Q-item linked to chudi.dev via P856. Once that lands, chudi.dev moves from PARTIAL 2/3 to PASS 3/3 and the audit becomes a working dogfood example.
Why this is its own section, not a row in an existing one
Off-site authority is the upstream cause; the other four AVR sections measure downstream effects. SEO Foundation measures my robots.txt. AI Infrastructure measures my schema markup. Citation Monitoring tells me whether AI cites me right now. AI Visibility tells me whether AI knows I exist. None of them ask whether the entity graph that AI systems consult contains my brand at all.
The AVR Framework already had four sections:
- SEO Foundation
- AI Infrastructure
- Citation Monitoring
- AI Visibility
Adding off-site authority as a fifth section, and giving it real verdict weight (PASS or PARTIAL is now required for an AI-READY overall verdict), reflects the order in which the signals actually matter. Building schema before the entity graph recognizes you is like writing a better resume for a job application that never reaches the hiring manager.
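In composite terms the gate reduces to a few lines. A sketch; the non-ready label and the dict shape are my placeholders, only the PASS-or-PARTIAL requirement comes from the framework change:

```python
def overall_verdict(section_verdicts: dict) -> str:
    """Section 5 now gates the composite: FAIL there caps the whole audit."""
    if section_verdicts.get("section_5_off_site_authority", "FAIL") == "FAIL":
        return "NOT AI-READY"
    # Remaining sections evaluated as before (omitted in this sketch).
    return "AI-READY"
```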
What edge cases the implementation handles
Three failure modes will bite anyone building this. The audit handles them so the verdict never lies.
First: Wikipedia disambiguation pages. A search for “Apple” returns the disambiguation page as a top hit. That is not a real entity entry. The check filters titles containing “disambiguation” and snippets containing “may refer to” before counting a pass.
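A sketch of that filter against the public Wikipedia search API; the function name and timeout value are my assumptions, the API parameters are real:

```python
import requests

WIKI_API = "https://en.wikipedia.org/w/api.php"

def wikipedia_check(name: str, timeout: float = 6.0) -> bool:
    """True if at least one mainspace article matches and is not a disambiguation page."""
    resp = requests.get(WIKI_API, params={
        "action": "query",
        "list": "search",
        "srsearch": name,
        "srnamespace": 0,  # mainspace only
        "format": "json",
    }, timeout=timeout)
    resp.raise_for_status()
    for hit in resp.json().get("query", {}).get("search", []):
        title = hit.get("title", "").lower()
        snippet = hit.get("snippet", "").lower()
        # "Apple (disambiguation)" and "Apple may refer to:" pages don't count.
        if "disambiguation" in title or "may refer to" in snippet:
            continue
        return True
    return False
```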
Second: the Wikidata SPARQL endpoint is intermittently slow. A 4-second timeout fails far too often. The implementation gives SPARQL 6 seconds, then falls back to wbsearchentities + wbgetentities to verify P856 / P973 directly. In practice the fallback path catches the cases SPARQL drops.
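A sketch of the two-path lookup, assuming a requests-based client; the SPARQL query shape, the five-candidate cap, and the function name are my assumptions, while the endpoints, wbsearchentities / wbgetentities actions, and P856 / P973 properties are real:

```python
import requests

SPARQL = "https://query.wikidata.org/sparql"
WD_API = "https://www.wikidata.org/w/api.php"

def wikidata_check(domain: str, brand: str) -> bool:
    """SPARQL first with a 6-second budget; fall back to the entity APIs on failure."""
    query = (
        "SELECT ?item WHERE { "
        "{ ?item wdt:P856 ?url } UNION { ?item wdt:P973 ?url } "
        f'FILTER(CONTAINS(STR(?url), "{domain}")) }} LIMIT 1'
    )
    try:
        resp = requests.get(SPARQL, params={"query": query, "format": "json"}, timeout=6)
        resp.raise_for_status()
        if resp.json()["results"]["bindings"]:
            return True
    except requests.RequestException:
        pass  # endpoint slow or down: fall through to the API path

    # Fallback: search entities by brand name, then inspect P856 / P973 claims directly.
    search = requests.get(WD_API, params={
        "action": "wbsearchentities", "search": brand,
        "language": "en", "format": "json",
    }, timeout=6).json()
    ids = [hit["id"] for hit in search.get("search", [])][:5]
    if not ids:
        return False
    entities = requests.get(WD_API, params={
        "action": "wbgetentities", "ids": "|".join(ids),
        "props": "claims", "format": "json",
    }, timeout=6).json()
    for entity in entities.get("entities", {}).values():
        for prop in ("P856", "P973"):
            for claim in entity.get("claims", {}).get(prop, []):
                value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", "")
                if domain in str(value):
                    return True
    return False
```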
Third: SSRF and broad-host queries. The homepage fetcher refuses non-http(s) schemes (no file://, no ftp://). The SPARQL filter refuses host strings shorter than three characters because CONTAINS(STR(?website), "") would match every website in Wikidata and return a misleading PASS.
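Both guards fit in one validation step. A sketch under the same assumptions as above (the function name is mine):

```python
from urllib.parse import urlparse

def safe_target(url: str) -> str:
    """Reject non-http(s) schemes and hosts too short to filter on safely."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"refusing scheme {parsed.scheme!r}: only http(s) is fetched")
    host = parsed.hostname or ""
    # CONTAINS(STR(?website), "") matches every item in Wikidata, so a short or
    # empty host would turn the SPARQL filter into a guaranteed false PASS.
    if len(host) < 3:
        raise ValueError(f"host {host!r} too short to use in a SPARQL CONTAINS filter")
    return host
```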
What is not in Phase 1
Phase 2 will add mention counts (Tier 1 and Tier 2 publications) and review platforms (G2, Trustpilot). Phase 3 will add Reddit, YouTube, listicles, LinkedIn company page health, and HARO / Qwoted coverage trails. Phase 4 wires the whole section into /citability full’s composite verdict.
I shipped Phase 1 alone because it closes the worst gap (entity recognition was completely absent from AVR before today) and because the three checks have stable, free APIs. The other phases depend on third-party data sources that can rate-limit or change. Ship the smallest correct thing today, not the right thing in a month.
Where this fits in the brand split
citability.dev OPERATIONALIZES AVR. It is the audit tool, the consulting deliverable, the LinkedIn lead magnet. chudi.dev OWNS AVR as intellectual property; the framework page is at chudi.dev/framework. This post lives on chudi.dev because it announces a framework evolution, not an isolated product feature ship.
If you want the audit, run it on citability.dev. If you want to understand why the framework looks the way it does, read chudi.dev/framework.
FAQ
Why are these three checks “VERIFIABLE” and not “BEST-EFFORT” like the rest of Section 5?
Wikipedia and Wikidata are public read-only APIs with deterministic outputs, and the JSON-LD extraction is a local parse of a single homepage fetch. A check is VERIFIABLE when re-running it five minutes later returns the same answer absent a real change. Mention counts and review-platform presence (Phase 2) are BEST-EFFORT because rate limits and search-result instability mean two consecutive runs can disagree.
Why does sameAs require four of six platforms instead of any one?
Single-platform sameAs is trivial to fake. Four-of-six requires that the operator actually maintains a multi-platform presence linked back to the canonical entity. Four was the threshold below which the signal stops correlating with AI citation in the audits I have seen so far. The bar will move once Phase 2 mention counts ship and the joint distribution becomes visible.
Will I get an AI citation just because I pass Section 5?
No. Section 5 measures the entity-recognition layer. Citation also depends on Section 1 (SEO Foundation), Section 2 (AI Infrastructure), and whether the AI systems’ next training run includes you. Section 5 PASS removes a hard ceiling. It does not guarantee what happens above the ceiling.
Can I run this on a competitor’s domain?
Yes. The audit is read-only and uses no authenticated APIs. Run it on whatever domain you want to baseline. The chudi.dev / github.com results in this post were generated exactly that way.
When does Phase 2 ship?
When Phase 1 has been used on at least 30 real audits and the false-positive / false-negative rates are characterized. The build estimate is 2 days; the validation budget is the gating constraint. No date promise.
Related
- Framework canonical: chudi.dev/framework
- Audit tool: citability.dev
- Patel signal hierarchy (research foundation): cited at chudi.dev/framework
Decision id for this build: dc-20260513T180052Z-9409. Ratified 2026-05-13 in a /grill + /librarian session, downgraded by /red-team to Inferred-with-DRIFT-flag. Phase 1 alone is the smallest correct ship.
Sources & Further Reading
- Entity Optimization for Brands in AI Search. Rank is a single-page game. Entity coherence is the compounding game. How sub-DR-20 brands engineer a Person + Organization graph that AI search engines actually cite.
- Perplexity vs ChatGPT: Different Citation Rules. Perplexity quotes liberally. ChatGPT quotes selectively. The engine-level differences in citation behavior that change what a sub-DR-20 brand should optimize for, engine by engine.
- I Audited 7 Websites for AI Citability. Here Is What Actually Predicts Citations. Audit data from 7 websites shows domain authority does not predict AI citations. DA-10 sites outperform DA-92 sites. Here is what actually matters.
What do you think?
I post about this stuff on LinkedIn every day and the conversations there are great. If this post sparked a thought, I'd love to hear it.
Discuss on LinkedIn