
Strivenn Thinking


Marketing Strategy

How AI Decides Who Gets Cited

By Matt Wilkinson

You have spent years building domain authority. You rank on page one for three of your most important keywords. But when a researcher asks ChatGPT which antibody suppliers are worth considering for their assay development workflow, your brand may not appear. Not because your content is poor. Because AI citation is not an extension of search ranking.


The two systems look at different signals, weight them differently, and serve different purposes. Understanding this distinction is the most important shift in how you think about your content programme this year.


Brand recognition beats backlinks

The most counterintuitive finding from recent research into AI citation behaviour comes from a 7,000-citation analysis published in late 2025. The strongest predictor of being cited by large language models was not domain authority, not inbound link count, not content freshness. It was brand search volume: how frequently people actively search for a brand by name.


The correlation was 0.334 - measurable and consistent. Backlinks - the cornerstone of traditional SEO authority - showed weak to neutral correlation with AI citation. Domain authority score showed similar results.


This means recognition compounds citation - and citation compounds recognition. The brands already being sought out by name are the brands AI surfaces, which increases their visibility, which increases the searches. The feedback loop favours whoever gets in early. And it is the mechanism that makes Citation Compression accelerate: once a small number of brands dominate AI recall for a category, search volume concentrates on them, which strengthens their citation dominance further.


For life science tools companies, this has a specific implication. If your brand is well known in one geography or one research community but invisible elsewhere, that gap shows up in AI citation patterns. Brand building is not soft. It is the structural prerequisite for AI discoverability.


The entity consistency problem

Sixty percent of ChatGPT queries are answered from parametric knowledge: what the model already knows from training, without any live web search. That means your latest blog post is irrelevant for those queries. What matters is what the model learned about your company when it was trained - and whether that picture was coherent.


If your company description on your website says one thing, your LinkedIn page says another, your Crunchbase entry says something different again, and your Wikidata entry is blank - you have an entity consistency problem. AI models assemble their understanding of organisations from multiple sources. Inconsistency reads as ambiguity. Ambiguity reduces citation confidence.
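A consistency gap like this can be measured mechanically before it is fixed. The sketch below scores pairwise token overlap between the descriptions a company publishes on different platforms; the platform names, descriptions, and the Jaccard-overlap heuristic are all illustrative assumptions, not a method from the research cited above.

```python
# Hypothetical entity-consistency check: compare the public descriptions
# a company publishes across platforms using token-level Jaccard overlap.
# Platform names and description strings are illustrative placeholders.
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two descriptions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def consistency_score(descriptions: dict) -> float:
    """Mean pairwise similarity across all platform descriptions."""
    pairs = list(combinations(descriptions.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

descriptions = {
    "website":    "Antibody supplier for assay development workflows",
    "linkedin":   "Supplier of antibodies for assay development",
    "crunchbase": "Life science tools company",
}

score = consistency_score(descriptions)
print(f"entity consistency: {score:.2f}")  # a low score flags divergent descriptions
```

A score near 1.0 means the platforms describe the company in the same words; a low score is the ambiguity signal described above, and the fix is editorial, not technical.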


Entity-driven content improved AI citation likelihood by more than 35% across major AI search platforms, according to Search Engine Land analysis from October 2025. The target is a coherent, consistent identity across at least four platforms: website, LinkedIn, Crunchbase, and Wikidata.


Strivenn's exhibitor surveys at both ELRIG Drug Discovery 2025 and SLAS 2026 found data quality cited as the primary barrier to AI adoption by 44% of respondents - unchanged across three months and two geographies. This is not a coincidence. It is the same underlying problem. Fragmented internal data produces fragmented external identity. If your organisation cannot agree on how to describe what it does, AI cannot either. Entity consistency starts with the same discipline that fixes data quality: one source of truth, maintained by someone who owns it. Without that foundation, optimisation is cosmetic.


Structure determines extractability

AI models do not read your content the way a human does. They extract it. And they extract specific kinds of content far more reliably than others.


Self-contained information chunks of 50-150 words earn citations at approximately 2.3 times the rate of flowing narrative prose. Pages with embedded statistics show a 37% increase in citation rates. The Princeton GEO study, published at KDD 2024, found that adding citations to content yields a 115% visibility increase for challenger brands entering established categories. Structure is not a design choice. It is a citation mechanic.


This is practical and actionable. Look at your highest-value product pages. Are they written as flowing marketing copy? Rewrite the most important claims as self-contained paragraphs. Add the specific data point - "increases yield by 40% compared to standard protocols, validated across 12 independent laboratories" - rather than the category claim - "industry-leading performance." The former is extractable. The latter is noise.
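The 50-150 word guideline can be audited mechanically across existing pages. A minimal sketch follows; the word-count band comes from the figures above, but the digit-based "has a statistic" heuristic is an assumption of this example, not part of the cited research.

```python
import re

CHUNK_MIN, CHUNK_MAX = 50, 150  # self-contained chunk band from the research above

def audit_paragraph(text: str) -> dict:
    """Flag whether a paragraph reads as a self-contained, statistics-led chunk."""
    words = text.split()
    return {
        "word_count": len(words),
        "in_band": CHUNK_MIN <= len(words) <= CHUNK_MAX,
        # Crude proxy for an embedded data point: any digit in the text.
        "has_statistic": bool(re.search(r"\d", text)),
    }

specific = ("Increases yield by 40% compared to standard protocols, "
            "validated across 12 independent laboratories.")
vague = "Industry-leading performance for demanding workflows."

print(audit_paragraph(specific))  # has_statistic: True
print(audit_paragraph(vague))     # has_statistic: False, and too short to stand alone
```

Run something like this over your five highest-value pages and the paragraphs that fail both checks are the rewrite candidates.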


FAQ schema appears in only 10.5% of AI-cited pages despite direct alignment with how AI models process question-based queries. Organisation schema, Author/Person schema for named experts, and Article schema for technical content are all underdeployed by life science companies. Pages using three or more schema types show measurably higher citation rates. Schema implementation can be completed in a single sprint. It is not sophisticated work. It is missing work.
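The schema types named above are standard schema.org JSON-LD, embedded in a page inside a `<script type="application/ld+json">` tag. Below is a minimal sketch of Organisation and FAQPage markup built in Python; the company name, URLs, and question text are placeholders, not real entities.

```python
import json

# Minimal JSON-LD for two of the schema types discussed above.
# All names, URLs, and answer text are illustrative placeholders.
organisation = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Bio Ltd",
    "url": "https://example-bio.com",
    # sameAs ties the entity to its other platform profiles,
    # which is the consistency signal discussed earlier.
    "sameAs": [
        "https://www.linkedin.com/company/example-bio",
        "https://www.crunchbase.com/organization/example-bio",
    ],
}

faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [{
        "@type": "Question",
        "name": "Which assays is the antibody validated for?",
        "acceptedAnswer": {
            "@type": "Answer",
            "text": "Validated for ELISA and Western blot in 12 independent laboratories.",
        },
    }],
}

print(json.dumps(organisation, indent=2))
print(json.dumps(faq, indent=2))
```

Each block is pasted into the page head as its own `application/ld+json` script; validating the output with Google's Rich Results Test before deploying is the sensible final step.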


Platform presence multiplies citation probability

Your buyers are using different AI platforms. And those platforms are pulling from very different sources. ChatGPT and Perplexity overlap on only 11% of the domains they cite. Google AI Overviews draw from organic search results in 93% of responses - but reference Reddit in 21%. Perplexity runs on real-time retrieval across an index of more than 200 billion URLs. Each platform has its own citation logic, its own source preferences, and its own version of your category. Assuming that visibility on one means visibility on all is a mistake.


Presence isn't about volume - it's about coverage. Brands present across four or more relevant platforms are cited at approximately 2.8 times the rate of brands visible on one or two. That gap is not closed by creating thin accounts everywhere. It's closed by ensuring your company is accurately and substantively represented on the specific platforms that feed the models your buyers actually use.


For most life science companies, that means five priorities: your website, LinkedIn, Crunchbase, Wikidata, and the industry publications where your researchers publish or are quoted. Not because those platforms are universally important - but because they are disproportionately important to the models your buyers are querying right now.


Get those five right before you think about anything else.


The counter-argument worth raising

Some of this will feel like a lot of effort for uncertain return. AI referral traffic is still less than 1% of total website traffic for most B2B companies. Traditional search still drives the pipeline. The ROI case is directional, not proven. If you're sitting across from a CFO asking for hard numbers, this is a difficult investment to justify on current returns alone.


That argument is correct about today. It's wrong about what today means.


The brands building AI visibility now are doing so in a relatively uncrowded space, at moderate cost, against competitors who haven't started yet. The brands that begin this work in two years will enter a landscape that's already stratified - where citation patterns are established, source preferences are baked in, and the gap between visible and invisible companies has had time to compound. Early search engine optimisation didn't look like a priority until it did. Then it looked like a missed window. Citation Compression does not wait.


Where to start

Three actions with disproportionate impact:

  • Run an entity audit across your website, LinkedIn, Crunchbase, and Wikidata - score consistency and fix the gaps

  • Restructure your five highest-value commercial pages around self-contained, statistics-led paragraphs

  • Build public author profiles for your three most credible named experts


Before you do any of that: run the test. Open ChatGPT, Perplexity, and Google AI Mode. Ask each one to name the top three companies in your specific product category. At SLAS 2026, 62% of life science exhibitors had never done this. Among those who had, 75% found themselves listed. The result tells you whether you are in the Unconsidered Set. That answer is your baseline. Everything else follows from knowing where you stand.


Your domain authority took years to build. Your AI visibility can start improving this quarter.

To learn more, visit the AI Discoverability Content Hub