A practical lesson in how AI “knows” your company even when Wikipedia is missing.
Executive summary (for busy leaders)
- We ran a 1,000‑entity benchmark across 17 chatbots / large language models (LLMs) using LLMtel-style scoring.
- In the Top 25 scoring entities in one of the segments, 24 out of 25 (96%) had a Wikipedia page.
- One Top‑25 entity did not: Australian Payroll Association (APA).
- In the Top/Bottom 25 slice, “Has a Wikipedia page” and “In the Top 25” were strongly linked (correlation = 0.806, where 1.0 is a perfect link and 0 is no link).
- The outlier matters because it shows a second path: when Wikipedia is missing, other trust signals can stand in if they are strong, consistent, and easy for AI systems to learn.
Why this matters now
A few years ago, the big question was, “Do we show up on Google?”
Now it’s also, “Do we show up in AI answers?”
Your customers, employees, investors, and partners are already asking AI systems things like:
- “Who are the top providers in X?”
- “What does this organization do?”
- “Is this company legit?”
If an AI system can’t place your organization confidently, it may:
- Leave you out of shortlists and recommendations, or
- Guess (and sometimes guess wrong).
That’s why AI visibility is becoming both a brand risk and a growth lever.
What we measured (two scores, two different behaviors)
This study separates something most people mix together:
1) “Do LLMs know you?”
That’s the Entity Score: out of 17 models, how many recognized the entity when asked directly.
2) “Do LLMs mention you?”
That’s the Questions Score: when models answer real questions, how often do they bring up the entity in their response?
This difference matters because “known” and “named” are not the same. A company can be recognized, but still not recommended.
The Wikipedia pattern in the Top vs Bottom 25
We took the Top 25 and Bottom 25 scoring entities from our 1,000‑entity report. In every case the brand was known, but it may or may not have shown up in the answers given. We then checked whether each entity had a Wikipedia page.
Here’s what we found:
| Group | Has Wikipedia | No Wikipedia | Total |
|---|---|---|---|
| Top 25 | 24 | 1 | 25 |
| Bottom 25 | 4 | 21 | 25 |
Two quick reads:
- Wikipedia pages showed up in 96% of the Top 25 (24/25).
- Wikipedia pages showed up in only 16% of the Bottom 25 (4/25).
In plain terms: in this slice, having a Wikipedia page was a huge advantage. Entities with Wikipedia pages were about 19× more likely to land in the Top 25 than entities without one.
And the relationship is not subtle. The correlation was 0.806, which is very strong for real-world business data.
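The headline numbers can be reproduced from the table above. Assuming the 0.806 figure is the phi coefficient for the 2×2 table (our assumption; the report does not name the statistic), a few lines of Python recover both the correlation and the "19×" ratio:

```python
import math

# 2x2 table from the Top/Bottom 25 slice:
# rows = Top 25 / Bottom 25, cols = has Wikipedia / no Wikipedia
a, b = 24, 1    # Top 25: has / lacks Wikipedia
c, d = 4, 21    # Bottom 25: has / lacks Wikipedia

# Share of Wikipedia-backed entities that landed in the Top 25,
# versus the share of entities without Wikipedia that did
p_with = a / (a + c)        # 24/28 ~ 0.857
p_without = b / (b + d)     # 1/22 ~ 0.045
ratio = p_with / p_without  # ~ 18.9, i.e. roughly 19x

# Phi coefficient for a 2x2 table: (ad - bc) / sqrt(product of the margins)
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(round(ratio, 1), round(phi, 3))  # 18.9 0.806
```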
Meet the outlier: Australian Payroll Association (APA)
Now the interesting part.
Despite the strong Wikipedia effect, one Top‑25 entity didn’t have a Wikipedia page in our check:
- Australian Payroll Association (APA)
This is more than trivia. It’s a strategy clue.
Because Wikipedia usually acts like a default “ID card” for AI systems:
- Clear definition
- Consistent naming
- Lots of links and citations
- Strong connections to other known topics
So when an entity breaks into the Top 25 without Wikipedia, it suggests something else is doing the job.
How you can win without Wikipedia: the “Wikipedia substitute” signals
We didn’t need Wikipedia to see the pattern. But Wikipedia helps explain why the pattern exists.
Wikipedia is powerful because it bundles several trust signals into one place.
If you don’t have that bundle, you can still build the pieces elsewhere.
Here are the substitute signals that most often replace Wikipedia in practice:
1) Third‑party authority mentions (hard to fake)
Examples:
- Government or regulator pages
- Standards bodies
- Universities, research orgs, credible nonprofits
- Professional oversight groups
Why it works: AI systems learn to trust sources that don’t sound like marketing.
2) High-quality industry directories (not just any directory)
The directories that matter are:
- Curated or verified
- Respected in the industry
- Referenced by other authoritative sites
Why it works: Directories create clean lists and categories, formats AI learns well.
3) Independent media and trade press coverage
What helps most:
- Profiles and explainers (not just press releases)
- Consistent reporting across multiple outlets over time
- Credible citations that others reference
Why it works: Repeated, independent coverage builds “public record.”
4) Clear, structured identity data
If AI sees your name in five different forms, it may treat them like five different entities.
Helpful signals:
- Consistent “About” pages and headings
- Schema.org Organization markup on your website
- Stable metadata across trusted platforms
- Consistent use of the acronym (if you use one)
Why it works: Machines love consistency. It reduces confusion and boosts confidence.
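As a concrete illustration, schema.org Organization markup is typically embedded on a site as JSON-LD. This is a minimal sketch; every name and URL below is a placeholder, not a real organization:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Payroll Association",
  "alternateName": "EPA",
  "url": "https://www.example.org",
  "logo": "https://www.example.org/logo.png",
  "description": "A professional association for payroll practitioners.",
  "sameAs": [
    "https://www.linkedin.com/company/example-payroll-association"
  ]
}
```

The `sameAs` links are what tie your official domain to your verified profiles elsewhere, which is exactly the cross-platform consistency described above.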
5) “Reference‑grade” content (built to be cited)
Think: content that looks like it belongs in a handbook, not an ad.
Examples:
- Reports, standards, benchmarks
- Glossaries, definitions, “how it works” pages
- Public data and methodology pages
Why it works: LLMs pull from material that reads like source material.
Important: I’m not claiming which specific sources explain APA’s result. The point is the pattern: entities can sometimes replace Wikipedia with a strong mix of these signals.
What this means for C‑level leaders
If you can have a Wikipedia page
Wikipedia can be a major accelerator, because it centralizes trust signals.
But it’s not a marketing channel. It’s governed by notability and sourcing rules. Trying to “force” it usually backfires.
If you can’t (or shouldn’t) have a Wikipedia page
You still have a path, one that’s often more controllable:
- Standardize your identity (one name, one acronym, one description)
- Earn third‑party proof (authority anchors and credible listings)
- Create reference-grade assets people cite
- Add structured data so machines don’t misread you
- Measure “known” vs “named” over time across multiple LLMs using a tool like LLMtel.com.
If you do this well, you can build the same kind of credibility Wikipedia provides, just distributed across the web.
The bottom line
Wikipedia is the fast lane to AI visibility.
But APA shows there’s another route:
strong third‑party validation + consistent identity + clear structure.
If your organization can’t rely on Wikipedia, the goal is simple:
Be easy to verify. Be easy to name.
Appendix A: “Known vs Named” matrix stats (from the 1,000‑entity benchmark)
This matrix separates two realities:
- Known = recognized by at least one of the 17 LLMs
- Named = appeared in answers at least once
| | Named in answers | Never named | Total |
|---|---|---|---|
| Known | 617 | 319 | 936 |
| Unknown | 28 | 36 | 64 |
| Total | 645 | 355 | 1000 |
Quick interpretation:
- 319 entities were known but never named (recognized, but not surfaced).
- 28 entities were unknown but still named at least once (mentioned, even without solid recognition).
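The gap between “known” and “named” can be quantified directly from the matrix. A quick check, using only the figures above:

```python
# Known vs Named matrix from the 1,000-entity benchmark
known_named, known_unnamed = 617, 319
unknown_named, unknown_unnamed = 28, 36

known = known_named + known_unnamed        # 936 entities recognized
named_rate_if_known = known_named / known  # share of known entities ever named

print(f"{named_rate_if_known:.1%}")  # 65.9%
```

In other words, being recognized by the models only got an entity named in answers about two thirds of the time, which is why the two scores are tracked separately.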
Appendix B: Checklist – “Non‑Wikipedia Visibility Audit”
Use this when you can’t rely on Wikipedia (or you want less dependence on it). Mark Yes / No.
A) Canonical identity (reduce confusion)
- One official name used everywhere (legal + brand)
- One official short name and one official acronym (if any)
- One consistent one‑sentence description across platforms
- About page clearly distinguishes us from similarly named entities
- Same spelling and punctuation across major listings
B) Authority anchors (hard trust)
- Listed/mentioned by at least one regulator, government body, standards org, or university (where relevant)
- Those mentions use our canonical name
- Those mentions link to our official domain (or a stable profile)
C) Credible directories (structured lists AI learns)
- Listed in respected directories with verification/editorial control
- Listing includes clear description + category + location (if relevant)
- Listing links to our official domain
D) Independent coverage (third‑party narrative)
- Covered by reputable trade press or mainstream outlets
- Coverage uses our canonical name and describes us accurately
- Coverage is consistent over time (not a one-off)
E) Reference‑grade content (content AI trusts)
- At least one plain-language explainer page that defines what we do
- At least one report/standard/educational asset others can cite
- Clean newsroom/resources page with dated updates
F) Structured data (machine-readable clarity)
- Website uses schema.org Organization markup (or equivalent)
- Key fields present: name, logo, URL, sameAs links, location, parent/child org (if relevant)
- Social profiles and listings point to the same official domain
G) Monitoring and testing (prove it works)
- Regular LLM visibility checks across multiple models using LLMtel.com
- Test name variants to detect a “variant penalty”
- Track two KPIs separately: recognized vs named