HalluGraph: Auditable Hallucination Detection for Legal RAG Systems via Knowledge Graph Alignment

Valentin Noël, Elimane Yassine Seidou, Charly Ken Capo-Chichi, Ghanem Amari

TL;DR

HalluGraph tackles the accountability problem in legal RAG systems by moving beyond semantic similarity to graph-based fidelity checks. It builds knowledge graphs from context, queries, and responses, and measures fidelity through Entity Grounding and Relation Preservation, combined into a Composite Fidelity Index with full auditability. Empirically, it achieves high discrimination (AUC = 0.979 on structured control documents and AUC ≈ 0.89 on generative legal tasks), outperforming baselines like BERTScore and NLI, while identifying the operating regime (longer, entity-rich contexts) in which the method is most effective. This approach provides auditable, explainable safeguards for high-stakes legal AI and offers a practical integration path for production RAG pipelines in regulated settings.

Abstract

Legal AI systems powered by retrieval-augmented generation (RAG) face a critical accountability challenge: when an AI assistant cites case law, statutes, or contractual clauses, practitioners need verifiable guarantees that generated text faithfully represents source documents. Existing hallucination detectors rely on semantic similarity metrics that tolerate entity substitutions, a dangerous failure mode when confusing parties, dates, or legal provisions can have material consequences. We introduce HalluGraph, a graph-theoretic framework that quantifies hallucinations through structural alignment between knowledge graphs extracted from context, query, and response. Our approach produces bounded, interpretable metrics decomposed into \textit{Entity Grounding} (EG), measuring whether entities in the response appear in source documents, and \textit{Relation Preservation} (RP), verifying that asserted relationships are supported by context. On structured control documents ($>$400 words, $>$20 entities), HalluGraph achieves near-perfect discrimination ($AUC = 0.979$), while maintaining robust performance ($AUC \approx 0.89$) on challenging generative legal tasks, consistently outperforming semantic similarity baselines. The framework provides the transparency and traceability required for high-stakes legal applications, enabling full audit trails from generated assertions back to source passages.
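
To make the two metrics concrete, here is a minimal Python sketch of how EG and RP could be computed once the graphs have been extracted. It rests on several assumptions that this summary does not confirm: graphs are represented as sets of (subject, relation, object) triples, matching is by exact label equality, each metric is a plain ratio, and an empty response graph scores 1.0. The `Triple` type, the function names, and the example parties are hypothetical; the paper's own definitions may use fuzzier matching.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One knowledge-graph edge: (subject entity, relation label, object entity)."""
    subject: str
    relation: str
    obj: str


def entities(triples: set[Triple]) -> set[str]:
    """Collect every entity that appears as a subject or an object."""
    return {t.subject for t in triples} | {t.obj for t in triples}


def entity_grounding(response: set[Triple], source: set[Triple]) -> float:
    """Fraction of response entities that also appear in the source graph."""
    resp_entities = entities(response)
    if not resp_entities:
        return 1.0  # assumption: an empty response graph asserts nothing
    return len(resp_entities & entities(source)) / len(resp_entities)


def relation_preservation(response: set[Triple], source: set[Triple]) -> float:
    """Fraction of response triples found verbatim in the source graph."""
    if not response:
        return 1.0  # assumption: same convention as above
    return len(response & source) / len(response)


# Hypothetical example data: source graph = context graph union query graph.
context = {
    Triple("Acme Corp", "sued", "Beta LLC"),
    Triple("Beta LLC", "breached", "Section 4.2"),
}
query = {Triple("Acme Corp", "sued", "Beta LLC")}
source = context | query

faithful = {Triple("Beta LLC", "breached", "Section 4.2")}
substituted = {Triple("Gamma Inc", "breached", "Section 4.2")}  # party swapped

print(entity_grounding(faithful, source), relation_preservation(faithful, source))
print(entity_grounding(substituted, source), relation_preservation(substituted, source))
```

In this toy case the faithful response scores 1.0 on both metrics, while the response with a substituted party drops to 0.5 on EG and 0.0 on RP. That is the kind of entity error the abstract says semantic similarity metrics tolerate.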

Paper Structure

This paper contains 12 sections, 1 theorem, 3 equations, 4 figures, 2 tables.

Key Result

Proposition 1

If $G_a$ is subgraph-isomorphic to $G_c \cup G_q$ via a label-preserving monomorphism, then $\text{EG} = 1$ and $\text{RP} = 1$.
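
The following is a short sketch of why the proposition plausibly holds. It assumes ratio-style definitions that are not given in this summary: write $G_a = (V_a, E_a)$ for the response graph, $G_s = G_c \cup G_q = (V_s, E_s)$ for the source graph, and $\ell(\cdot)$ for the entity label; assume $V_a$ and $E_a$ are non-empty. The symbols $G_s$, $M_V$, $M_E$, $\ell$, and $\varphi$ are introduced here for illustration only.

$$\text{EG} = \frac{|M_V|}{|V_a|}, \qquad M_V = \{\, v \in V_a : \exists\, v' \in V_s,\ \ell(v') = \ell(v) \,\}$$

$$\text{RP} = \frac{|M_E|}{|E_a|}, \qquad M_E = \{\, (u, r, v) \in E_a : \exists\, (u', r, v') \in E_s,\ \ell(u') = \ell(u),\ \ell(v') = \ell(v) \,\}$$

Let $\varphi : V_a \to V_s$ be the label-preserving monomorphism. For every $v \in V_a$, the vertex $v' = \varphi(v)$ satisfies $\ell(v') = \ell(v)$, so $M_V = V_a$ and $\text{EG} = 1$. For every edge $(u, r, v) \in E_a$, the edge $(\varphi(u), r, \varphi(v))$ lies in $E_s$ with matching endpoint labels, so $M_E = E_a$ and $\text{RP} = 1$.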

Figures (4)

  • Figure 1: HalluGraph pipeline. Knowledge graphs are extracted from legal documents, queries, and responses. Alignment metrics (EG, RP) quantify fidelity with full traceability.
  • Figure 2: Discrimination power ($\Delta$) across synthetic and legal domains (factual $-$ hallucination scores). Blue: Entity Grounding (ours). Orange: NE Overlap. Gray: BERTScore. Our graph-based metric consistently outperforms semantic similarity, which fails to penalize entity errors in legal contexts.
  • Figure 3: Operating regime. Blue curve: Performance on synthetic control tasks improves with context length. Orange squares: Our Legal RAG datasets fall into the high-context regime ($>400$ words) and achieve robust discrimination ($AUC \approx 0.89$), significantly above the chance line (0.5).
  • Figure 4: Legal RAG integration. HalluGraph acts as a post-generation guardrail. Failed verifications trigger re-retrieval or human escalation.
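
The guardrail behaviour described for Figure 4 can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' implementation: the threshold values, the retry budget, and the names `GuardrailPolicy`, `Action`, and `decide` are all hypothetical, and the sketch assumes EG and RP have already been computed for the generated response.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    RE_RETRIEVE = "re_retrieve"
    ESCALATE = "escalate_to_human"


@dataclass(frozen=True)
class GuardrailPolicy:
    # Hypothetical values: the source states neither thresholds nor a retry budget.
    min_eg: float = 0.9
    min_rp: float = 0.8
    max_retries: int = 2


def decide(eg: float, rp: float, attempt: int,
           policy: GuardrailPolicy = GuardrailPolicy()) -> Action:
    """Post-generation check: accept, retry retrieval, or escalate to a human.

    `attempt` counts re-retrievals already performed for this query (0 on the
    first generation).
    """
    if eg >= policy.min_eg and rp >= policy.min_rp:
        return Action.ACCEPT
    if attempt < policy.max_retries:
        return Action.RE_RETRIEVE
    return Action.ESCALATE


# A well-grounded answer passes; a poorly grounded one is retried, then escalated.
print(decide(eg=1.0, rp=1.0, attempt=0))
print(decide(eg=0.5, rp=0.0, attempt=0))
print(decide(eg=0.5, rp=0.0, attempt=2))
```

Keeping the policy in a separate frozen dataclass means the thresholds can be tuned per deployment without touching the decision logic.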

Theorems & Definitions (1)

  • Proposition 1