S2 Ep11: Six Steps to AI Citation Success
By Matt Wilkinson
If your brand isn't cited by AI tools when buyers search your category, you're not losing a ranking - you're being skipped entirely before the decision is made.
Shownotes
Open ChatGPT, Perplexity, or Google AI Mode right now and type: "Name the top three companies in [your product category]." If your brand isn't in the answer, you're not losing a click. You're not even in the race.
This episode is for life science marketers and commercial leaders who want to understand - and act on - AI discoverability before their competitors do.
Matt Wilkinson and Jasmine Gruia-Gray debate the six-step AI discoverability audit: a structured, agency-free framework designed to help life science brands earn citation presence in AI-generated answers. The conversation moves fast, challenges assumptions, and gets into the practical friction points that most audit frameworks gloss over - from entity consistency and ungating strategy to proxy pages and measurement cadence.
The key idea: if your brand isn't cited by AI tools when buyers search your category, you're not losing a ranking - you're being skipped entirely before the decision is made.
- Why entity consistency across platforms is the foundation of AI discoverability - and why mismatched descriptions confuse AI models at an identity level
- The real debate between fixing entity consistency first versus ungating content - and why the sequencing matters more than you think
- How proxy pages work as a halfway house between gated content and full discoverability - and where that trade-off breaks down
- Why 50 manual queries every 90 days is more achievable than it sounds - and the hidden bias problem that can corrupt your results
- How AI models are building personalised understanding of individual users - and what that means for running clean discoverability tests
- Why getting your positioning stable now matters: dual-use bots are feeding your content back into model training runs, so consistency compounds
Keywords: AI discoverability, life science marketing, AI citation, large language models, SEO for AI, entity consistency, gated content strategy, proxy pages, AI search, content ungating, life science SEO, AI marketing audit
Transcript
The following is the full transcript of this episode of A Splice of Life Science Marketing. Matt Wilkinson and Jasmine Gruia-Gray debate the six-step AI discoverability audit - working through entity consistency, content accessibility, proxy pages, and measurement cadence with the practical rigour that life science marketing teams actually need.
The unconsidered set: a live test for your brand
Speaker: Jasmine [00:06]
Happy World Meteorological Day. Well, that's hard to say.
Speaker: Matt Wilkinson [00:11]
Hi Jasmine, I didn't even know it was World Meteorological Day. Does that mean we're supposed to have good weather today?
Speaker: Jasmine [00:16]
Yeah, if only, if only. I had no idea either, but I just happened upon the information. Apparently this was created in 1950 as a way to draw more attention to weather, climate, and climate change.
Speaker: Matt Wilkinson [00:40]
Well, it's a perfect time in spring where we've definitely got four seasons in a day at times.
Speaker: Jasmine [00:52]
So should we get started? So I wanted to do something right now. I want everybody listening to pull up ChatGPT, Perplexity, or Google AI Mode and type the following: name the top three companies in your category, your product category. So let's say your product category is peptide synthesizers. You would type in: name the top three companies in peptide synthesizers. If your brand is not in the answer, you're not losing a ranking. This isn't SEO. You're not losing a click. You're being skipped entirely before the researcher even considers their options. That's what you call the unconsidered set.
And across Strivenn's exhibitor surveys at SLAS in February this year, 62% of life sciences companies had never even run that test. So today we're going to debate your six-step AI discoverability audit. You argue it gives life sciences marketing teams a clear measurable path to AI citation presence - no agency required. And I think some of those phases are more straightforward than they look on paper, at least to me. And I'm forewarning, I'm going to push back on the ones that aren't. Ready?
Speaker: Matt Wilkinson [02:32]
For sure - some of them are definitely harder than others.
Step one: entity consistency as identity, not housekeeping
Speaker: Jasmine [02:36]
So your audit starts with entity consistency - making sure your company description matches across six or however many platforms - and that sounds like a cleaning exercise. Why is that the foundation rather than, say, getting your best content ungated when AI can actually read it?
Speaker: Matt Wilkinson [03:04]
So it's definitely a cleaning exercise, but AI systems fail on this so often, and it's not a cosmetic problem. It's really an identity problem. And the reason I think we have to start there is because it cascades across everything. AI models don't discover your company the way a search engine crawls a page. They build an understanding of who you are by aggregating signals across authoritative sources. So your LinkedIn company page, your Crunchbase profile, what might have been written about you on SelectScience, or your Biocompare pages - those are all identity signal pages. And if they're different, the AI doesn't know how to categorise you.
And it's really interesting that over time, when we're talking about that sort of branding and positioning - so fundamental marketing practice of who are we, what do we stand for, what is the value that we deliver to our customers - if that is confused, the AI gets really, really confused. And so that's why it's so important to start there and make sure that the AI can understand what it is you stand for.
Speaker: Jasmine [04:23]
Okay, so I hear the logic, and the attribution risk is real, but here's where I want to push back a bit. For most life science companies I've worked with, entity consistency is potentially achievable in a week. Your website - you have control over that. Your LinkedIn page - you have control over that. Crunchbase - those are all editable. Things like SelectScience, a little less so. Biocompare maybe takes a bit longer. Wikidata takes a day to learn. This isn't a six-week project. It's a focused afternoon with the right person.
If the audit positions entity consistency as step one out of six, it risks giving marketing teams permission to spend three weeks polishing their Crunchbase entry while their best scientific content sits behind a gate that AI can't read. The content accessibility problem in step two is structurally harder and strategically more important for most brands in our sector. A company with a clean entity profile and all its authoritative data gated and locked away is less visible than a company with a slightly inconsistent profile and three ungated pieces of content. I would argue step one and step two should run in parallel, not sequentially. The sequencing creates a false sense of progress on the easier problem.
Speaker: Matt Wilkinson [06:14]
So I think you could probably run all six steps in parallel if you have the bandwidth. And if somebody has access to all of the platforms and the login details, it's perfectly reasonable - you could get that done in a day, if you know what you want to be known for and if that's agreed. So part of the problem here is making sure that you have alignment on what you want to be known for. That's really crucial - and that you've optimised that phraseology not just for the humans you want to influence, but also for the large language models we want to influence, because the proximity of words next to your brand name is really, really important. So make sure you haven't got a lot of fluff and that what you deliver is clear. That's crucial, so that the large language models understand very easily who you are and what you stand for.
The other challenge that I've encountered, even in small firms, is getting access to login details. In any large organisation, somebody might have set up the Crunchbase entry. Who knows who owns it now? How do we find that out? Who owns all of these different historical platform entries? Getting to that point is a housekeeping process that some organisations will have down, absolutely; for others, it's going to be a real challenge. So it sounds simple on paper, and for many organisations it will be - for others, it's going to be harder.
And I think it's really just one of those things where - there's a sequence of steps - getting agreement on what to put in these is probably the harder thing to do. What is it that we want the large language models to know us for? That's a conversation in and of itself where everybody's still learning exactly how to make sure that the words that you use are the ones that people are going to be searching for and asking questions about.
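As an aside for readers working through this step: the drift Matt and Jasmine describe can be spotted mechanically before the harder alignment conversation starts. A minimal sketch in Python - the platform names and descriptions are hypothetical, and difflib's similarity ratio is just a rough stand-in for the judgement call a marketer would make by eye:

```python
# Sketch: flag inconsistent company descriptions across platforms.
# All platform names and descriptions below are hypothetical examples.
from difflib import SequenceMatcher

profiles = {
    "Website":    "Acme Bio builds automated peptide synthesizers for research labs.",
    "LinkedIn":   "Acme Bio builds automated peptide synthesizers for research labs.",
    "Crunchbase": "Acme Bio is a life science tools company.",  # drifted copy
}

def consistency_report(profiles, threshold=0.8):
    """Compare each profile against the website copy and flag drift."""
    baseline = profiles["Website"]
    report = {}
    for platform, text in profiles.items():
        score = SequenceMatcher(None, baseline.lower(), text.lower()).ratio()
        report[platform] = (round(score, 2), "OK" if score >= threshold else "REVIEW")
    return report

for platform, (score, status) in consistency_report(profiles).items():
    print(f"{platform:<11} similarity={score:<5} {status}")
```

Anything flagged REVIEW is a candidate for the "what do we want to be known for" conversation, not an automatic rewrite.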
Proxy pages: halfway house or broken promise?
Speaker: Jasmine [08:30]
Yeah, I think that's the strategic bullseye, I would agree there. For the proxy page approach, you suggest that it's the audit's answer to gated content. You write a structured summary, add schema markup, and let it earn the citation while the gate still captures the lead. But if AI models can tell the difference between a summary page and the primary source, are you not just optimising for a citation that carries less credibility than the real thing?
Speaker: Matt Wilkinson [09:12]
So that's a great question. I'm actually of the belief that we need to ungate the vast majority of things that we have. And while I still believe that every now and again there is a need for some gated content, we should absolutely be providing these proxy pages where the value is clearly communicated both to the human and to the AI, so the large language model can learn about it.
One of the things I've been advocating for, probably since last April when Deep Research came out and we realised that all of a sudden you could create your own buyer's guides, is that those fact sheets that are very, very pretty and have all of those technical specifications and all of that rich information a buyer might want when considering your product or service - if that's hidden behind a gate or just stuck in a PDF, the large language models are not going to find it. So we really need to make sure that the number of tokens it takes for a large language model to access and understand your data is as low as possible. That's really what we're looking to do. In many organisations, making a big change - like getting rid of the ability to gate content and therefore capture leads - is going to cause massive strategic marketing challenges. That's quite a big change in terms of how marketing is measured. If all of a sudden you lose one of your key sources of measurement, your key KPI, that's going to cause some nervousness. That's a bigger discussion. But if you can get to a point where the large language model can at least better understand what's behind that gate, that's important.
Speaker: Matt Wilkinson [11:11]
So it's a halfway house that works, from what we can tell so far. But is it the ultimate best-case scenario? The best-case scenario is: make everything free to everybody and do such a good job that you don't need to collect names and contact information - the whole "here's some information, give us your details, we're going to nurture you" routine. Just make the experience so good that people can't say no to buying. In fact, they don't even have to be asked - they just come in and say, "Take my money."
Speaker: Jasmine [11:51]
Yeah, I don't know that I've had that experience in the life science tools industry, but you know, one can hope. So I get what you're saying from a mechanistic perspective. My concern is more about expectations. So here's the scenario: a researcher who gets your proxy page cited in an AI answer and clicks through expects to read the full document, whether it's a publication or a buyer's guide, whatever it happens to be. When they hit a gate, the experience depends entirely on how much they want the data. For a known brand with an established reputation, they'll likely fill in the form.
For a brand they've never heard of before - which is precisely the unconsidered set scenario - it's going to be, I think, a challenge. The gate is a trust barrier they're being asked to cross with no relationship. You've used AI visibility to get into the consideration set - good on you. Then the first meaningful interaction the researcher has with your brand is a form. That's a difficult moment and almost a broken trust. I'm not saying the proxy page approach is wrong. In fact, I've used it myself, especially for peer-reviewed publications that sit behind a paywall or a subscription. It's probably the best available option for some gated content, but the audit presents it as a clean solution when, in my mind, it's actually more of a trade-off.
Speaker: Matt Wilkinson [13:47]
It's definitely a trade-off. I firmly believe, however, that these are two separate philosophical challenges. One is: should we be gating content, and if so, how much? For many organisations, changing how they approach that - and therefore how they measure the effectiveness of marketing - is a big challenge. So on one hand we've got a big structural change in terms of what this means for marketing, versus a change that's actually pretty minimal. It's an easy lift to make the case: we just need to create a proxy page that gives a few bits of information and provides a richer experience to the large language model, so we're part of that consideration.
And the thing is, if those gates exist already, we're not putting anything extra in the way. I'm definitely not advocating for gating more content. What I'm actually trying to say is: here's a halfway house between completely ungating and still having a gate in place, but making sure we're still able to do some marketing to the large language models and therefore the humans that are reading the outputs.
Speaker: Jasmine [15:07]
So would you advocate that marketers maybe should run a pilot for four months and take some currently gated content fully ungated, take some currently gated content and create a proxy page -
Speaker: Matt Wilkinson [15:30]
So it's not necessarily just about ungating the content. The gate typically works like this: here's a form, give us your information, and then we deliver your PDF. If you just remove the gate, you still have the button there, or whatever the engagement mechanism is, and you're still delivering the PDF. What we want to do is elevate the key information out of the PDF and put it in something that is more machine-readable at scale by the large language models. Of course, large language models can read PDFs, but if they're just searching for information, we already know that if a page doesn't load quickly, they move on. So this is about reducing those barriers for the large language models. The gating of content itself - I'm a big advocate for ungating as much as possible, but that's a different philosophical question, one that can involve an entire marketing department, rewrite marketing strategies, and disrupt how the whole function is measured. So this is trying to be a practical and sensible halfway house that will still improve your AI discoverability without impacting the existing structure of how you go to market.
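For readers who want to see the mechanics, here is a minimal sketch of generating the schema markup a proxy page might carry. The product name, URL, and specifications are hypothetical, and TechArticle is just one plausible schema.org type for a fact-sheet summary - not a prescription from the audit itself:

```python
# Sketch: build a schema.org JSON-LD block for a proxy page that
# summarises a gated fact sheet. All values below are hypothetical.
import json

def proxy_page_jsonld(title, summary, page_url, specs):
    """Expose the key specs in machine-readable form; the full PDF stays gated."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "TechArticle",
        "headline": title,
        "abstract": summary,                 # the ungated summary text
        "isAccessibleForFree": True,         # true for the summary page itself
        "mainEntityOfPage": page_url,        # the page hosting the gated download
        "keywords": ", ".join(specs),        # headline specs, cheap for models to parse
    }, indent=2)

print(proxy_page_jsonld(
    title="PeptideMax 3000 fact sheet (summary)",
    summary="Key specifications for the PeptideMax 3000 peptide synthesizer.",
    page_url="https://example.com/resources/peptidemax-3000",
    specs=["96-well parallel synthesis", "0.005-1 mmol scale"],
))
```

The output would be embedded in the proxy page inside a `<script type="application/ld+json">` tag, keeping the token cost of understanding the product low even while the full document sits behind the form.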
Measurement cadence: rigorous or theatre?
Speaker: Jasmine [16:56]
So the audit recommends querying five AI platforms with 10 questions per platform every 90 days and building a tracking log. That's 50 manual queries per cycle recorded and interpreted by a marketing team that more often than not is very stretched. Is that a realistic measurement programme or is it just producing data that looks rigorous and may go nowhere?
Speaker: Matt Wilkinson [17:32]
So 50 queries per 90-day cycle works out to be roughly one query per working day in that three months. That's pretty easy to achieve. Of course, you could do it much quicker with some tools. You might be able to automate those searches with others.
It's harder with some. We have to be careful that when we're running those searches, we're not overly biasing the results by running them through accounts and models we already work in a lot - because that history can influence the results. We know that if I already search a lot about a particular brand - let's say I'm really interested in Apple for a while, and I ask lots of questions about different Apple products - and I then ask about something related that Apple sells, it's more likely to show me the Apple option, because it's already recognised that I'm interested in the brand. So we have to be quite careful about that. It's a bit like running searches on Google - depending on where you are, because of your location tags, you often get different results. Some because you're doing local SEO optimisation, others because you're optimising for the distributor in Germany rather than the UK or the US. So that's a really important point to be aware of. And so it's really about looking at how things are changing over time.
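The tracking log the audit calls for can be as simple as a CSV with one row per query. A minimal sketch - the platforms, queries, and dates are placeholders, and in practice each row is recorded by hand (or via an API where one exists):

```python
# Sketch: a minimal tracking log for the 50-query, 90-day cadence.
# Platform names, queries, and dates below are placeholder examples.
import csv, io
from collections import defaultdict

FIELDS = ["date", "platform", "query", "brand_cited"]

def citation_rate(rows):
    """Per-platform share of queries where the brand appeared in the answer."""
    hits, totals = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row["platform"]] += 1
        hits[row["platform"]] += row["brand_cited"] == "yes"
    return {p: hits[p] / totals[p] for p in totals}

# Keep the log as CSV so it survives team handovers and 90-day gaps.
log = io.StringIO()
writer = csv.DictWriter(log, fieldnames=FIELDS)
writer.writeheader()
writer.writerows([
    {"date": "2025-03-24", "platform": "ChatGPT",    "query": "top peptide synthesizer companies", "brand_cited": "yes"},
    {"date": "2025-03-25", "platform": "ChatGPT",    "query": "best peptide synthesis platform",   "brand_cited": "no"},
    {"date": "2025-03-25", "platform": "Perplexity", "query": "top peptide synthesizer companies", "brand_cited": "yes"},
])
log.seek(0)
rates = citation_rate(csv.DictReader(log))
print(rates)  # e.g. {'ChatGPT': 0.5, 'Perplexity': 1.0}
```

Comparing these per-platform rates cycle over cycle is what turns 50 manual queries into a trend rather than a one-off snapshot.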
Speaker: Jasmine [19:08]
Would incognito mode help guard against that?
Speaker: Matt Wilkinson [19:14]
It's difficult, because incognito tends not to save your searches, but it's hard to know exactly how the memory is working, and every model handles it differently. The only sure-fire way is to completely wipe the memory, which I know a lot of people won't want to do, because they've spent a lot of time tuning the large language models to know them well.
Speaker: Matt Wilkinson [19:43]
I ran a test recently - one of these online quizzes - and I got both Operator within ChatGPT and the Atlas browser to answer the quiz for me. And I got Claude in Chrome to do the same on the same quiz. They both answered the questions almost exactly as I did, and I got exactly the same outputs as when I did it myself.
Speaker: Jasmine [20:09]
Hmm.
Speaker: Matt Wilkinson [20:10]
So any time we do a personality test or a quiz about what appeals to us, our large language models are beginning to know us almost as well as we do. That's a real challenge, and of course we can't influence it. So we're really looking at the native, uninfluenced models - because once we layer on all of our own memories and start talking about brands, and the model knows we're interested in brands A, B, C and D, that's going to start influencing some of the results we might get.
Final verdict: go in with your eyes open
Speaker: Jasmine [20:49]
So this has been a fabulous discussion as always. I think the audit is the right structure. The six steps, clear diagnostics, scoring rubrics - you can run without any outside consultancy. My challenges were about sequencing, about being honest with teams on the proxy page trade-offs, about tightening the monitoring cadence. None of those are arguments against doing the audit. They're arguments for going in with your eyes open.
Speaker: Matt Wilkinson [21:28]
Let's be clear - this is just a starting point, not a guarantee. The science on how to measure AI discoverability is changing all the time. We're learning more and more about how these models work and how their searches work. And of course they're changing all the time as well. But certain things have become clearer and clearer as more bright minds have been put to task on trying to figure this out - and as more and more people use large language models as part of their decision-making process, as part of searching for information.
Getting this right early on is really important. And I would stress that many of the bots are dual-use. Many of the bots that go out and perform web searches - when they find the information, they're not just finding it to deliver to you or me. They're also delivering that information from the web back to the repositories that are then used to train the next versions of the models. So what we want to do is get as much of our information into the models as quickly and as consistently as possible, so that from training run to training run the large language models better know who we are - so they really understand that this is consistently who this company is, what they stand for, what they mean. In some ways, this speaks against the need to constantly rebrand and update your positioning. Once it's nailed, try to let the bots see that it's stable - this is who you are and what you stand for.
Speaker: Jasmine [23:20]
So I'd like to end on the note of putting a challenge out to all of our listeners and to those who read our blogs. Please, run this test. It's so, so critical. And we'd love to hear from you. We'd love to hear what the answer was from this test and what you're planning on doing about it.
Speaker: Matt Wilkinson [23:44]
Yeah, absolutely. Keen to learn from all of you out there.