The analytics stack most marketers rely on was built for clicks. Here is the new visibility stack for AI-mediated discovery.
The analytics stack most marketers rely on was built for a world where visibility meant clicks. Someone searched, saw your listing, clicked through, and you measured the visit. That model is breaking.
When a user asks ChatGPT, Perplexity, or Google's AI Overview a question, they often get the answer without ever visiting a source. Rand Fishkin's analysis of zero-click behavior found that 60% of marketers still prioritize traffic growth as their top goal — yet every major referral source is sending fewer clicks year over year. A Pew Research study confirmed that users who encounter AI summaries in Google results are measurably less likely to click on links to other websites.
In this environment, being retrieved, named, and recalled consistently by AI models can be more valuable than pageviews. But you cannot optimize what you cannot measure. The industry needs a new visibility stack — one designed for AI-mediated discovery.
Traditional SEO metrics collapse visibility into a single funnel: impressions, clicks, sessions. AI visibility requires a different decomposition. Think of it as three layers, each capturing a distinct signal.
The question: Are you in the candidate set?
Before an LLM can cite you, it has to retrieve you. In retrieval-augmented generation (RAG) architectures — the approach described in the foundational Lewis et al. paper that underpins most production AI search systems — the model first queries a document index, pulls candidate passages, then generates its response using those passages as context.
The metric: Retrieval Presence (RP) — the percentage of runs where your entity or URL appears in the model's retrieved sources, even if not cited in the final output.
How to measure it: Run controlled queries across target AI platforms (ChatGPT, Perplexity, Gemini, Claude, Copilot). Capture the model's "sources," "about this answer," or "references" sections. Scrape inline attributions. Log when your brand or domain is mentioned versus merely used as background context. Google's own documentation notes that pages must be indexed and eligible for snippets to appear in AI features — but being eligible and being retrieved are different things.
Platforms like Otterly.AI and Profound have begun building monitoring dashboards that automate this capture across multiple AI platforms, tracking brand mentions and domain citations daily.
RP is the widest net. It tells you whether the AI even considers you a relevant source. Without presence in the candidate set, nothing else matters.
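The measurement loop above reduces to a simple computation once runs are logged. A minimal sketch, assuming you capture each monitored query as a record of which domains appeared in the platform's sources; the function name and record shape are illustrative, not a standard API:

```python
def retrieval_presence(runs, domain):
    """Share of runs in which `domain` appears among the retrieved sources.

    Each run is a dict like:
      {"platform": "perplexity", "retrieved": ["example.com", "wikipedia.org"]}
    """
    if not runs:
        return 0.0
    hits = sum(1 for run in runs if domain in run["retrieved"])
    return hits / len(runs)

# Hypothetical logged runs across four platforms:
runs = [
    {"platform": "perplexity", "retrieved": ["example.com", "wikipedia.org"]},
    {"platform": "chatgpt",    "retrieved": ["competitor.com"]},
    {"platform": "gemini",     "retrieved": ["example.com"]},
    {"platform": "copilot",    "retrieved": ["example.com", "competitor.com"]},
]
print(retrieval_presence(runs, "example.com"))  # 0.75
```

In practice the `retrieved` list would be populated by the scraping and logging steps described above, one record per query run per platform.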
The question: How explicitly are you credited?
Not all citations are equal. Being named as a source with a direct URL is categorically different from having your content silently paraphrased. Citation Depth (CD) captures this distinction with a weighted score per run.
Beyond the base score, track four dimensions:
Anchor type. Is the model linking to your homepage or a deep page? Deep-page citations signal that the model is engaging with specific content, not just recognizing your brand.
Anchor text. Does the citation use your brand name or a generic descriptor? Brand-name anchors carry stronger association signals for future retrieval.
Position. Are you cited at the top of the response or buried in a footnote? Top-of-response citations get more user attention and reinforce brand recall.
Plurality. Are you the sole cited source or one of many? Being the only source cited for a claim is a stronger signal than being listed alongside five competitors.
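A base score plus the four dimensions can be combined into one per-run number. The article does not fix exact weights, so the values below are illustrative assumptions only:

```python
# Hypothetical base credit tiers -- illustrative weights, not fixed by the framework.
BASE = {"linked_citation": 1.0, "named_mention": 0.6, "paraphrase_only": 0.2}

def citation_depth(run):
    """Score one run: base credit plus bonuses for the four dimensions."""
    score = BASE[run["credit"]]
    if run.get("deep_page"):
        score += 0.3  # anchor type: deep-page link beats homepage
    if run.get("brand_anchor"):
        score += 0.3  # anchor text: brand name, not a generic descriptor
    if run.get("top_position"):
        score += 0.2  # position: cited near the top of the response
    if run.get("sole_source"):
        score += 0.2  # plurality: sole cited source for the claim
    return score

run = {"credit": "linked_citation", "deep_page": True,
       "brand_anchor": True, "top_position": False, "sole_source": False}
print(round(citation_depth(run), 2))  # 1.6
```

Averaging this score across all runs yields the CD term used in the composite metric.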
Seer Interactive's CTR research found that brands cited in Google's AI Overviews see 35% higher organic click-through rates and 91% higher paid CTR versus uncited competitors. Citation depth is not a vanity metric — it has direct downstream traffic consequences.
The question: Does the model remember you across prompts, variants, and time?
A single citation is noise. Consistent citation is signal. Recall Consistency (RC) measures whether the model reliably associates your brand with a topic across variations.
The metric: A Jaccard-style stability coefficient. Take the set of runs where you appear, group them by prompt paraphrase, user intent (discover, validate, inform, convert), and time window (days, weeks). Calculate the intersection over union across these groups. High RC means you show up regardless of how the question is phrased or when it is asked.
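The intersection-over-union calculation is straightforward once appearances are grouped. A minimal sketch, with hypothetical topic sets grouped by time window; the same function applies to paraphrase or intent groups:

```python
from functools import reduce

def recall_consistency(appearance_sets):
    """Jaccard-style stability coefficient: intersection over union of the
    appearance sets from each group (paraphrase, intent, or time window)."""
    inter = reduce(set.intersection, appearance_sets)
    union = reduce(set.union, appearance_sets)
    return len(inter) / len(union) if union else 0.0

# Hypothetical example: queries for which the brand was cited, by week.
week1 = {"crm pricing", "crm reviews", "best crm"}
week2 = {"crm pricing", "best crm"}
week3 = {"crm pricing", "best crm", "crm alternatives"}
print(recall_consistency([week1, week2, week3]))  # 0.5
```

A coefficient near 1.0 means the brand surfaces for the same queries in every group; a low value means its appearances are scattered and unstable.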
Bonus metric: First-mention rate — the percentage of runs where you are the first brand cited. First mention correlates with strongest association in the model's latent space. If you are consistently the first name the model reaches for, you own that topic in AI-mediated discovery.
RC matters because LLM responses are non-deterministic. The same prompt can produce different source selections across runs. Brands that appear sporadically have weak entity embeddings. Brands that appear consistently have been encoded as authoritative on the topic.
These three layers combine into a single composite metric, the AI Visibility Index (AIVI):
AIVI = 0.4 · RP + 0.4 · CD + 0.2 · RC
The default weights reflect a balanced visibility goal, but you should tune them to your strategic priority.
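The composite is a plain weighted sum, which makes re-weighting trivial. A sketch with the article's default weights and one hypothetical alternative weighting (the alternative values are illustrative, not prescribed):

```python
def aivi(rp, cd, rc, weights=(0.4, 0.4, 0.2)):
    """AI Visibility Index: weighted sum of the three layers.
    Inputs are normalized to [0, 1]; weights should sum to 1."""
    w_rp, w_cd, w_rc = weights
    return w_rp * rp + w_cd * cd + w_rc * rc

# Default balanced weights:
print(round(aivi(0.75, 0.50, 0.40), 2))  # 0.58

# Hypothetical re-weighting for a brand prioritizing consistent topic
# ownership over raw presence:
print(round(aivi(0.75, 0.50, 0.40, weights=(0.3, 0.3, 0.4)), 3))
```

Whatever weights you choose, hold them constant across reporting periods so the index stays comparable over time.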
Measuring AI visibility is useful only if it changes what you do. Here is how each layer maps to tactical response:
Low RP: The model is not retrieving you. Focus on content that matches the query patterns AI systems use. Ahrefs' updated research shows AI Overviews reduce position-one CTR by 58% — but to lose clicks, you first need to be in the game. Ensure your content is crawlable, well-structured, and semantically clear. Structured data with strong entity identifiers helps retrieval systems resolve who you are.
Low CD, decent RP: You are being retrieved but not credited. This is a content authority problem. The model sees your content but attributes the knowledge elsewhere. Strengthen your entity signals: named author bylines, clear organizational attribution, sameAs links to Wikidata and authoritative registries, and explicit claims that are structured for extraction.
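The sameAs linking described above is typically expressed as schema.org structured data. A minimal sketch of Organization markup; the brand name, URLs, and Wikidata identifier are all placeholders:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "ExampleCo",
  "url": "https://www.example.com/",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q000000",
    "https://en.wikipedia.org/wiki/ExampleCo",
    "https://www.linkedin.com/company/exampleco"
  ]
}
```

Embedding this as JSON-LD on your primary pages gives retrieval systems an unambiguous handle for resolving your entity across registries.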
Low RC, decent CD: You are cited sometimes but not consistently. This suggests thin topical coverage. The model finds you for some phrasings but not others. Build content depth across the full range of query intents for your topic. Cover the discover, validate, inform, and convert angles so the model encounters you regardless of how the user frames the question.
The AIVI framework does not replace traditional web analytics. It sits alongside it. You still need to track organic sessions, conversion rates, and revenue attribution. But you also need to track what happens before the click — and increasingly, instead of the click.
The old stack measured: Did the user visit your site?
The new stack measures: Does the AI know who you are, credit you by name, and remember you next time?
Both matter. But only one is growing in importance.
James Calder is the editor of The Search Signal, covering AI-powered search, generative engine optimization, and the future of brand discovery.