CiteGuard Fixes LLM Citation Attribution — and Approaches Human Accuracy

CiteGuard reframes citation verification as a retrieval problem and achieves near-human accuracy. Here is why that matters for AI-generated content.

Large language models fabricate citations at staggering rates. Studies estimate that between 78% and 90% of LLM-generated references are partially or entirely invented. That is not a rounding error. It is a structural failure, and one that directly undermines the trust layer AI systems need to function as credible information sources.

A new framework called CiteGuard, developed by researchers Yee Man Choi, Xuehang Guo, Yi R. Fung, and Qingyun Wang, offers one of the more promising fixes. It reframes citation checking as a retrieval problem rather than a generation problem — and in doing so, gets within striking distance of human-level accuracy.

The Problem with LLM-as-a-Judge

The default approach to verifying LLM citations has been to use another LLM to check the work. This sounds reasonable in theory. In practice, it fails badly.

CiteGuard's authors found that GPT-4o, when used as a citation evaluator, produced extremely low recall — just 16–17%. It could identify correct citations when it saw them (precision near 1.0), but it missed the vast majority of valid references. The evaluator was confidently wrong, which is worse than being obviously wrong.
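That asymmetry is easy to make concrete. The counts below are hypothetical, chosen only to match the reported ratios (precision near 1.0, recall around 17%); they are not figures from the paper.

```python
# Hypothetical counts illustrating high precision with low recall.
# A judge that confirms only 17 of 100 valid citations, with zero
# false positives, looks flawless on precision while missing 83%.
true_positives = 17   # valid citations the judge confirmed
false_positives = 0   # invalid citations it wrongly confirmed
false_negatives = 83  # valid citations it rejected

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)

print(f"precision = {precision:.2f}")  # 1.00
print(f"recall = {recall:.2f}")        # 0.17
```

A near-perfect precision score is exactly what makes the failure hard to spot: every answer the judge gives looks right, and the missing 83% are invisible.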

This is the core insight behind CiteGuard: you cannot solve a retrieval problem with generation alone. You need to actually look things up.

How CiteGuard Works

CiteGuard is a retrieval-augmented agent framework. Instead of asking an LLM whether a citation looks right, it sends the agent to search for the actual papers and verify the match. The framework introduces six retrieval actions that dramatically expand what the verification agent can do:

  • Search by citation count or relevance — queries title and abstract fields across academic databases
  • Select — chooses papers from search results for deeper inspection
  • Find in text — searches full paper content for specific strings
  • Ask for more context — retrieves surrounding text (plus or minus three sentences) when an excerpt is ambiguous
  • Search text snippet — searches paper body content directly, bypassing PDF parsing failures
  • Iterative retrieval — excludes previously selected papers to avoid repetition
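The action set above can be sketched as a small agent interface. To be clear, the class and method names below are illustrative assumptions for exposition, not CiteGuard's actual API; the real search action would be backed by an academic search service.

```python
from dataclasses import dataclass, field

@dataclass
class VerificationAgent:
    # Tracks previously selected papers, supporting iterative retrieval
    # (exclude papers the agent has already inspected).
    seen_ids: set = field(default_factory=set)

    def search(self, query: str, sort_by: str = "relevance") -> list:
        """Query title/abstract fields; sort_by is "relevance" or "citations"."""
        raise NotImplementedError("backed by an academic search API")

    def select(self, paper_id: str) -> None:
        """Choose a paper for deeper inspection; exclude it from later searches."""
        self.seen_ids.add(paper_id)

    def find_in_text(self, paper_text: str, needle: str) -> bool:
        """Search full paper content for a specific string."""
        return needle in paper_text

    def ask_for_context(self, sentences: list, i: int, window: int = 3) -> list:
        """Return the sentence at index i plus up to `window` sentences
        on each side (the "plus or minus three sentences" action)."""
        return sentences[max(0, i - window): i + window + 1]
```

The design point is that each action is a cheap, verifiable lookup rather than a generative judgment, which is what lets the agent ground its verdicts in retrieved evidence.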

Two actions matter most. The "search text snippet" action lets CiteGuard find matches even when PDF access fails — a common problem with Semantic Scholar data. The "ask for more context" action allows the agent to gather surrounding text when initial excerpts are insufficient. An ablation study showed these two actions alone account for a 23.5 percentage-point improvement over the baseline.
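The fallback behavior of "search text snippet" can be sketched as follows. The `Paper` type, the in-memory snippet index, and the function names are assumptions for illustration; in practice the fallback would query an indexed academic API rather than a local dictionary.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Paper:
    paper_id: str
    full_text: Optional[str]  # None when PDF parsing failed

# Hypothetical snippet index standing in for an academic search backend.
SNIPPET_INDEX = {"p1": "retrieval augmented citation verification"}

def snippet_search(paper_id: str, needle: str) -> bool:
    """Search the paper's indexed body text, bypassing PDF parsing."""
    return needle in SNIPPET_INDEX.get(paper_id, "")

def locate_excerpt(paper: Paper, excerpt: str) -> bool:
    """Try full-text matching first; fall back to snippet search when
    the PDF could not be parsed (a common failure with scraped data)."""
    if paper.full_text is not None:
        return excerpt in paper.full_text
    return snippet_search(paper.paper_id, excerpt)
```

The point of the fallback is coverage: a citation check should not fail just because one retrieval path (PDF parsing) did.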

The Numbers

CiteGuard was tested on the CiteME benchmark, which contains 130 computer science excerpts, each with a single missing citation. The results are striking:

  • The previous best approach (CiteAgent with GPT-4o) hit 35.3% accuracy
  • CiteGuard with GPT-4o reached 45.1%, a 9.8 percentage-point gain (roughly 28% relative) under identical settings
  • CiteGuard with DeepSeek-R1 reached 68.1% accuracy
  • Human performance on the same benchmark: 69.7%

That last comparison is the headline. CiteGuard with DeepSeek-R1 is within 1.6 percentage points of human accuracy on citation attribution. On easy-to-medium difficulty samples, the system actually exceeds 87% accuracy.

The framework also generalizes. A new cross-domain evaluation (CiteMulti) tested it on biomedical papers and long multi-citation paragraphs. CiteGuard outperformed the baseline across both categories, though absolute accuracy on these harder tasks remains lower — 34.4% overall — signaling room for improvement outside computer science.

Cost and Practicality

One of the underrated findings is cost efficiency. DeepSeek-R1, the best-performing model in the CiteGuard framework, costs approximately $0.005 per citation to run. GPT-4o costs $0.12 for the same task — 24 times more — while performing significantly worse. Google's Gemini 2.0 is essentially free but lands in the middle of the accuracy range.

This matters because citation verification at scale needs to be cheap. If you are building a system that generates hundreds or thousands of citations per day — as any serious AI writing tool does — the difference between half a cent and twelve cents per citation is the difference between viable and prohibitive.
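The scaling argument is simple arithmetic. The per-citation costs below are the reported figures; the daily workload is an assumed number for illustration.

```python
# Back-of-envelope scaling of the reported per-citation costs (USD).
COST_PER_CITATION = {"DeepSeek-R1": 0.005, "GPT-4o": 0.12}

CITATIONS_PER_DAY = 1_000  # assumed workload, for illustration only

for model, unit_cost in COST_PER_CITATION.items():
    daily = unit_cost * CITATIONS_PER_DAY
    print(f"{model}: ${daily:,.2f}/day, ${daily * 30:,.2f}/month")
```

At a thousand citations a day, the gap is five dollars versus $120 per day, which compounds into the "viable versus prohibitive" distinction at monthly scale.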

Citation accuracy is not just an academic problem. It sits at the center of the trust crisis facing AI-generated content. When ChatGPT, Perplexity, or Google Gemini generate answers with citations, those references are the user's only anchor to reality. If they are fabricated, the entire output is unverifiable.

For publishers, the implications are direct. AI systems that can accurately attribute sources are more likely to cite your content correctly — and more likely to surface it in AI-generated responses. Frameworks like CiteGuard push AI systems from "plausible-sounding references" toward "verifiable attribution," which is exactly the direction generative engine optimization depends on.

The CiteGuard project was also featured at NeurIPS 2025, suggesting the research community sees citation attribution as a priority problem. The open-source code is available for integration into existing pipelines.

What to Watch

CiteGuard demonstrates that retrieval-augmented validation can close most of the gap between LLM citation behavior and human accuracy. The remaining 1.6 percentage points are not trivial — hard citations still stump the system at 15.2% accuracy — but the trajectory is clear.

The broader signal: citation verification is shifting from an afterthought to an infrastructure requirement. As AI-generated content proliferates, the systems that can prove their sources will win trust. The ones that cannot will lose it.


James Calder is the editor of The Search Signal, covering AI-powered search, generative engine optimization, and the future of brand discovery.
