TRANSPARENCY

Methodology

How we measure medspa visibility across ChatGPT, Claude, Gemini, Google AI Overviews, and Perplexity: the five surfaces patients actually use when they ask an AI for a recommendation.

1 · What we measure

Visibility is the share of patient-intent queries where an AI surface names your medspa in its answer. We measure it across exactly five surfaces:

  • ChatGPT (OpenAI): measured against ChatGPT’s default model with web tools enabled.
  • Claude (Anthropic): measured against Claude’s default model with web search enabled.
  • Gemini (Google): measured against gemini.google.com’s default model.
  • Google AI Overviews: the AI-generated summary block at the top of Google SERPs, captured separately from Gemini because the surface, ranking factors, and citation behavior differ.
  • Perplexity: a web-search answer engine (Sonar). It leans heavily on Reddit and other user-generated content (UGC) for local queries, which gives it a different citation profile from the other four.

Perplexity was added as the fifth audited surface in June 2026, once it became a primary answer engine patients use for local discovery.

The audit

See yourself the way AI sees you

Patients no longer scroll ten blue links. They ask ChatGPT, Claude, Gemini, Google AI Overviews, or Perplexity for a recommendation, and they act on the names that come back. The audit shows you exactly which names those are.

We run 31 patient-intent query types across all five surfaces, then hand you a single visibility index and the five changes that move it. No 40-item checklist: just the work that shifts the score.

2 · The query basket

Every audit runs 31 patient query types per medspa, expanded into 140 query-runner combinations to account for variance in how AI surfaces respond to phrasing and context.

Those two numbers measure different things, so they never tally directly: the 140 combinations are the 31 questions run across the four chat surfaces (31 × 4 = 124) plus 16 cost and consideration questions routed only to Google AI Overviews. When a report later cites a smaller figure, such as 28 unique questions, that is the deduplicated count of distinct patient questions actually measured for that medspa. It is the unit we use for the visibility rate, so the same question is never double-counted across surfaces.

The 31 types cluster into six intents:

  • Treatment-led: “best botox in Sunnyvale”, “best lip filler in Sunnyvale”.
  • Location-led: “Sunnyvale medical spa”, “Sunnyvale aesthetic clinic”.
  • Comparative: “[your medspa] vs [rival]”, “[brand] vs [brand]”.
  • Trust-led: “safest place for botox”, “most reviewed medspa”.
  • Cost-led: “how much does botox cost in Sunnyvale”, “lip filler pricing in Sunnyvale”.
  • Research-led: “how to choose a medspa in Sunnyvale”, “is medical weight loss worth it”.

Cost and consideration queries run per treatment, not just in aggregate: “how much does lip filler cost in Sunnyvale” and “is botox worth it and where to get it” are their own queries. A clinic that ranks for “best botox” can still be absent from the answers a patient sees when checking the price or weighing whether a treatment is worth it, which is where the booking is decided.

Queries are run across the five surfaces according to the split above. Runners vary the phrasing slightly to surface LLM variance: the same intent expressed three ways often returns three different brand sets.
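
The combination count described in this section can be sketched as follows. This is a minimal illustration, not the production runner: it assumes the "four chat surfaces" are ChatGPT, Claude, Gemini, and Perplexity, and the query identifiers are placeholders rather than the real basket.

```python
from itertools import product

# ASSUMPTION: the four chat surfaces are the non-Overviews surfaces.
CHAT_SURFACES = ["ChatGPT", "Claude", "Gemini", "Perplexity"]
OVERVIEW_SURFACE = "Google AI Overviews"


def build_combinations(query_types, overview_only_queries):
    """Pair every query type with every chat surface, then add the
    cost and consideration questions routed only to AI Overviews."""
    combos = list(product(query_types, CHAT_SURFACES))
    combos += [(q, OVERVIEW_SURFACE) for q in overview_only_queries]
    return combos


# Placeholder identifiers (hypothetical), standing in for the real basket.
query_types = [f"type-{i}" for i in range(1, 32)]                      # 31 query types
overview_only = [f"cost-or-consideration-{i}" for i in range(1, 17)]   # 16 questions

combos = build_combinations(query_types, overview_only)
print(len(combos))  # 31 * 4 + 16 = 140
```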

3 · Scoring & dedup

For each (query × surface) combination we record three things:

  • Mention: was the medspa named in the AI’s answer?
  • Citation: was the medspa’s URL or property cited as a source?
  • Rank position: ordinal position among named competitors (1st, 2nd, 3rd…).

Before counting, we collapse variant spellings and sub-locations of the same chain. “Skin Refine”, “Skin Refine Medspa”, and “Skin Refine: Sunnyvale” count as one brand. We do not collapse separately-named clinics under the same owner.
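
One way to picture the collapse step is an explicit alias table. The sketch below is an assumption about how such a step could work, not a description of our internal tooling: the `ALIASES` table, the function names, and "Glow Clinic" are all hypothetical.

```python
import re

# Hypothetical alias table: normalized variant -> canonical brand.
# The entries mirror the example in the text above.
ALIASES = {
    "skin refine": "Skin Refine",
    "skin refine medspa": "Skin Refine",
    "skin refine sunnyvale": "Skin Refine",
}


def normalize(name):
    """Lowercase, turn punctuation into spaces, and trim the ends."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def canonical_brand(name):
    """Return the canonical brand, or the trimmed name when no alias is known."""
    return ALIASES.get(normalize(name), name.strip())


def dedup_brands(names):
    """Collapse known variants while preserving first-seen order."""
    seen = []
    for name in names:
        brand = canonical_brand(name)
        if brand not in seen:
            seen.append(brand)
    return seen


print(dedup_brands(["Skin Refine", "Skin Refine Medspa",
                    "Skin Refine: Sunnyvale", "Glow Clinic"]))
```

An explicit table, rather than fuzzy matching, keeps separately-named clinics under the same owner from being merged by accident.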

The per-surface visibility score is a 0–100 number reflecting mention rate weighted by rank position, normalized so that showing up first in every query equals 100 and showing up never equals 0.
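
A minimal sketch of a score with these properties is shown below. The exact rank weighting is not specified here, so the reciprocal-rank weight (1st = 1.0, 2nd = 0.5, and so on) is an assumption chosen only because it satisfies the two endpoints; the `Observation` type and function names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """One (query x surface) result, mirroring the three recorded fields."""
    mentioned: bool
    cited: bool                 # recorded, but not used by this score
    rank: Optional[int] = None  # 1-based position; None when not mentioned


def rank_weight(rank):
    """ASSUMPTION: reciprocal-rank weighting, so position 1 earns full credit."""
    return 1.0 / rank


def surface_score(observations):
    """Rank-weighted mention rate for one surface, scaled to 0-100."""
    if not observations:
        return 0.0
    total = sum(rank_weight(o.rank) for o in observations
                if o.mentioned and o.rank)
    return 100.0 * total / len(observations)


sample = [
    Observation(mentioned=True, cited=True, rank=1),
    Observation(mentioned=True, cited=False, rank=2),
    Observation(mentioned=False, cited=False),
    Observation(mentioned=False, cited=False),
]
print(surface_score(sample))  # (1.0 + 0.5) / 4 * 100 = 37.5
```

Under this weighting, a medspa named first in every query scores 100 and one never named scores 0, matching the normalization described above.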

Five surfaces

One basket of questions. Five answer engines.

Every audit runs the same 31 query types across ChatGPT, Claude, Gemini, Google AI Overviews, and Perplexity: 140 query-runner combinations in total. Runners vary the phrasing because the same intent phrased three ways often returns three different brand sets.

Google AI Overviews is scored separately from Gemini on purpose: the surfaces, ranking factors, and citation behavior differ, and folding them together would smooth over real visibility gaps.

ChatGPT · Claude · Gemini · Google AI Overviews · Perplexity

4 · What you receive

Within 24 hours of intake, we walk a PDF to your front desk. It contains:

  • Visibility score per surface and a combined index across all five.
  • Side-by-side competitor table: the top 3 named competitors per query intent.
  • Ranked fixes, ordered by projected lift. We don’t hand you a 40-item checklist; we hand you the 5 changes that move the score.
  • A 30-day projection if recommended fixes are implemented.

The deliverable

Walked to your front desk in 24 hours

Within 24 hours of intake you receive a PDF: visibility scored per surface, a combined index across all five, and a side-by-side table of the competitors AI names ahead of you.

Then the ranked fixes, ordered by projected lift, with a 30-day projection if they're implemented. We don't email it and hope you read it. We walk it to your front desk.

5 · Why these five surfaces

These are the five surfaces where a real patient, today, can ask a question and receive a direct recommendation that names specific businesses. Search results pages and review aggregators sit downstream of these surfaces: patients increasingly stop at the AI answer and never reach them.

Google AI Overviews is listed separately from Gemini because they are functionally different products: Overviews is embedded in Google SERPs and pulls from a different citation pipeline than gemini.google.com’s standalone interface. Treating them as one would smooth over real visibility gaps.

6 · Limits & what we don’t claim

  • LLM outputs are stochastic. We run multiple combinations per query type to smooth this, but two audits a week apart can disagree at the margin on lower-scoring brands.
  • The audit is a point-in-time snapshot. AI surfaces update continuously; we re-score monthly for active clients.
  • The 30% visibility-increase commitment refers to the number of unique patient questions where at least one of the five surfaces names you, measured at the 90-day mark against the baseline audit on the same query basket. The unit is unique questions, not question-and-surface combinations. Where your baseline names you in fewer than 5 unique questions (including zero), the commitment is instead an absolute increase of at least 3 additional unique questions where an AI surface names you, since a percentage of so small a base is not a meaningful measure.
  • We do not guarantee position-1 placement on any individual query. We optimize for the underlying signals that AI surfaces weight, not for any single query result.
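
The commitment rule above can be sketched as a small calculation. This is an illustration only: the function names and sample data are hypothetical, and rounding the 30% target up to a whole question is an assumption, since the text does not state a rounding rule.

```python
def unique_questions_named(results):
    """Count unique questions where at least one surface named the medspa.

    `results` maps a question to a dict of surface -> bool (named or not).
    """
    return sum(1 for surfaces in results.values() if any(surfaces.values()))


def commitment_target(baseline):
    """Minimum unique-question count needed at the 90-day mark."""
    if baseline < 5:
        # Small-base rule: at least 3 additional unique questions.
        return baseline + 3
    # ASSUMPTION: round the 30% increase up to a whole question.
    # Integer arithmetic (ceil of baseline * 1.3) avoids float rounding issues.
    return (baseline * 13 + 9) // 10


# Hypothetical baseline results for three questions on two surfaces.
baseline_results = {
    "best botox in Sunnyvale": {"ChatGPT": True, "Claude": False},
    "Sunnyvale medical spa": {"ChatGPT": False, "Claude": False},
    "safest place for botox": {"ChatGPT": True, "Claude": True},
}

baseline = unique_questions_named(baseline_results)
print(baseline, commitment_target(baseline))
```

A question named on two surfaces still counts once, which is why the unit is unique questions rather than question-and-surface combinations.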

Questions about the methodology?

We’re happy to walk through any of this in detail before you commit. Email Jonah@sunnyvaleaeo.com or request your audit through the intake form.