PersonaMatrix: A Recipe for Persona-Aware Evaluation of Legal Summarization

Tsz Fung Pang, Maryam Berijanian, Thomas Orth, Breanna Shi, Charlotte S. Alexander

TL;DR

PersonaMatrix tackles the need for stakeholder-sensitive evaluation in legal summarization by introducing a persona-by-criterion framework and a controlled dimension-shifted dataset. It combines persona-conditioned evaluators with a new Diversity-Coverage Index (DCI) to quantify between-persona divergence and within-persona coherence. The approach uses an Extractor→Rewriter→Validator pipeline and LLM-driven rubrics to generate and assess variants across three conflicting quality dimensions. Results show statistically significant divergences between persona-aware and generic judges, with interior optima suggesting multi-objective trade-offs. This work provides a scalable, practitioner-friendly path toward more accessible and useful legal AI summaries, with code and data publicly available.

Abstract

Legal documents are often long, dense, and difficult to comprehend, not only for laypeople but also for legal experts. While automated document summarization has great potential to improve access to legal knowledge, prevailing task-based evaluators overlook divergent user and stakeholder needs. Tools are needed that capture the technical detail a litigator expects from a case summary while remaining accessible to a self-help public researching their own lawsuits. We introduce PersonaMatrix, a persona-by-criterion evaluation framework that scores summaries through the lens of six personas, including legal and non-legal users. We also introduce a controlled dimension-shifted pilot dataset of U.S. civil rights case summaries that varies along depth, accessibility, and procedural detail, as well as a Diversity-Coverage Index (DCI) that exposes the divergent optima of legal summaries between persona-aware and persona-agnostic judges. This work enables refinement of legal AI summarization systems for both expert and non-expert users, with the potential to increase access to legal knowledge. The code base and data are publicly available on GitHub.

Paper Structure

This paper contains 18 sections, 1 theorem, 3 equations, 3 figures, 1 table.

Key Result

Lemma 1

Shuffling persona labels across i.i.d. cases removes persona effects on optimal levels, preserving the marginal $\pi$ independently of $\mathcal{L}_d$ but making $\tilde{P}\perp \mathcal{L}_d$. Thus $I(\tilde{P};\mathcal{L}_d)=0$ and $I_d=0$, so $\mathrm{DCI}_d(\lambda)=(1-\lambda)D_d$. Moreover, th
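The independence step in the lemma can be checked numerically on a toy example. The sketch below is illustrative only and is not the authors' implementation: the joint distributions are made up, `mutual_information` and `dci` are hypothetical helper names, and the form $\mathrm{DCI}_d(\lambda)=\lambda I_d+(1-\lambda)D_d$ is an assumption inferred from the special case stated in the lemma (the excerpt does not define $D_d$, so it is treated as an opaque number here).

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a joint distribution joint[persona][level]."""
    row = [sum(r) for r in joint]          # persona marginal (pi)
    col = [sum(c) for c in zip(*joint)]    # optimal-level marginal
    mi = 0.0
    for p, r in enumerate(joint):
        for l, v in enumerate(r):
            if v > 0:                      # 0 * log(0) is taken as 0
                mi += v * math.log(v / (row[p] * col[l]))
    return mi

def dci(i_d, d_d, lam):
    # ASSUMPTION: convex combination, chosen only because it reduces to
    # (1 - lam) * D_d when I_d = 0, as the lemma states.
    return lam * i_d + (1 - lam) * d_d

# Made-up joint: two personas that prefer different optimal levels.
dependent = [[0.4, 0.1],
             [0.1, 0.4]]

# Idealized shuffle: same marginals, but persona and level are independent,
# so the joint is the outer product of the two marginals.
pi = [0.5, 0.5]
level_marginal = [0.5, 0.5]
shuffled = [[p * q for q in level_marginal] for p in pi]

i_dependent = mutual_information(dependent)
i_shuffled = mutual_information(shuffled)
```

In this toy setting `i_dependent` is positive while `i_shuffled` is zero, so under the assumed form only the $(1-\lambda)D_d$ term survives. With finitely many shuffled cases the empirical estimate would generally be small but not exactly zero.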

Figures (3)

  • Figure 1: (a) Distribution of optimal (argmax) levels along three summary quality dimensions (Depth vs. Conciseness, Technical Precision vs. Lay Accessibility, and Procedural Focus vs. Narrative Story) for 6 persona judges and 2 generic judges over 25 civil rights cases. (b) Direction and magnitude of deviation of persona-aware judges' mean optimal levels from generic judges.
  • Figure 2: Demonstration of how a legal summary is rewritten in the Controlled Dimension-Shifted Dataset Generator, using the shift from Precision to Accessibility as an example (top), and scored in the PersonaMatrix Agentic Evaluator (bottom). There are $n$ PersonaCritics generating $n$ sets of evaluation criteria for $n$ personas in parallel.
  • Figure 3: (a) Changes in the DCI components as the corruption ratio increases along the Procedural dimension; (b) changes in the final DCI as the corruption ratio increases. The bottom-right panel in (a) is the same as the bottom-left panel in (b).
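
The quantities plotted in Figure 1, per-case optimal (argmax) levels and the deviation of a persona judge's mean optimal level from a generic judge's, can be sketched as follows. This is a minimal illustration under stated assumptions: the scores are invented, the five-level scale and the tie-breaking rule are not taken from the paper, and `optimal_level` and `mean_optimal_level` are hypothetical names.

```python
from statistics import mean

def optimal_level(scores):
    """Index of the highest-scoring level; ties go to the lowest level (assumption)."""
    return max(range(len(scores)), key=lambda i: scores[i])

def mean_optimal_level(cases):
    """Mean argmax level over a list of per-case score vectors."""
    return mean(optimal_level(scores) for scores in cases)

# Invented scores: one row per case, one column per level along a single
# quality dimension (five levels, indexed 0..4, is an assumption).
persona_scores = [
    [2, 4, 5, 3, 1],   # optimum at level 2
    [1, 3, 5, 4, 2],   # optimum at level 2
    [2, 5, 4, 3, 1],   # optimum at level 1
]
generic_scores = [
    [1, 2, 3, 4, 5],   # optimum at level 4
    [1, 2, 3, 5, 4],   # optimum at level 3
    [2, 3, 4, 5, 5],   # tie between levels 3 and 4, resolved to level 3
]

# Signed deviation: negative means the persona judge prefers lower levels.
deviation = mean_optimal_level(persona_scores) - mean_optimal_level(generic_scores)
```

Here the persona judge's optima sit in the interior of the scale while the generic judge's sit near the upper end, so the deviation is negative.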

Theorems & Definitions (1)

  • Lemma 1: Shuffle Sanity