Can LLMs Synthesize Court-Ready Statistical Evidence? Evaluating AI-Assisted Sentencing Bias Analysis for California Racial Justice Act Claims

Aparna Komarla

TL;DR

Evaluations that use the LLM-as-a-Judge framework to compare LLM-generated analyses with those of statisticians suggest that AI can serve as a powerful descriptive assistant for real-time evidence generation when ethically incorporated into the analysis pipeline.

Abstract

Resentencing in California remains a complex legal challenge despite legislative reforms like the Racial Justice Act (2020), which allows defendants to challenge convictions based on statistical evidence of racial disparities in sentencing and charging. Policy implementation lags behind legislative intent, creating a 'second-chance gap' where hundreds of resentencing opportunities remain unidentified. We present Redo.io, an open-source platform that processes 95,000 prison records acquired under the California Public Records Act (CPRA) and generates court-ready statistical evidence of racial bias in sentencing for prima facie and discovery motions. We explore the design of an LLM-powered interpretive layer that synthesizes results from statistical methods like Odds Ratio, Relative Risk, and Chi-Square Tests into cohesive narratives contextualized with confidence intervals, sample sizes, and data limitations. Our evaluations comparing LLM performance to statisticians using the LLM-as-a-Judge framework suggest that AI can serve as a powerful descriptive assistant for real-time evidence generation when ethically incorporated in the analysis pipeline.
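As an illustration of the statistical methods named above, the following minimal sketch computes an odds ratio with a Wald 95% confidence interval, a relative risk, and a Pearson chi-square test for a single 2x2 table. It is not the Redo.io implementation: the `TwoByTwo` class, the function names, the choice of a Wald interval, the absence of a continuity correction, and the example counts are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Hypothetical 2x2 table of counts (not drawn from the paper's data)."""
    a: int  # group A, outcome present
    b: int  # group A, outcome absent
    c: int  # group B, outcome present
    d: int  # group B, outcome absent


def odds_ratio(t: TwoByTwo) -> float:
    # Odds of the outcome in group A divided by the odds in group B.
    return (t.a * t.d) / (t.b * t.c)


def relative_risk(t: TwoByTwo) -> float:
    # Outcome rate in group A divided by the outcome rate in group B.
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def odds_ratio_ci(t: TwoByTwo, z: float = 1.96) -> tuple[float, float]:
    # Wald interval on the log-odds scale (assumed method; z=1.96 gives ~95%).
    se = math.sqrt(1 / t.a + 1 / t.b + 1 / t.c + 1 / t.d)
    log_or = math.log(odds_ratio(t))
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def chi_square(t: TwoByTwo) -> tuple[float, float]:
    # Pearson chi-square for a 2x2 table, without continuity correction.
    n = t.a + t.b + t.c + t.d
    stat = n * (t.a * t.d - t.b * t.c) ** 2 / (
        (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    )
    # With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    table = TwoByTwo(a=30, b=70, c=15, d=85)  # made-up counts
    low, high = odds_ratio_ci(table)
    stat, p = chi_square(table)
    print(f"OR={odds_ratio(table):.2f} (95% CI {low:.2f} to {high:.2f})")
    print(f"RR={relative_risk(table):.2f}, chi2={stat:.2f}, p={p:.4f}")
```

Reporting the interval and the sample counts alongside each point estimate mirrors the abstract's emphasis on contextualizing results; zero cells would need separate handling that this sketch omits.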

Paper Structure (40 sections, 3 equations, 3 figures, 7 tables)

Introduction
Legislative Reform History
Use of Statistics in Resentencing Claims
RJA Implementation Gaps
Research Question
AI Assistant for Bias Analysis and Court-Ready Evidence
System in Action
Value Added from AI Analysis
Implementation and Results
Evaluation of AI Assistant for Bias Analysis
Bias Analysis Rubric
Results
Limitations
Observational Study Constraints
Evaluation Constraints
...and 25 more sections

Figures (3)

Figure 1: Process of synthesizing statistical findings and generating an evidence report. See Appendix H for a sample report.
Figure 2: View of https://tool.redoio.info/bias_analysis to evaluate the strength of the association between demographics and sentencing outcomes (e.g. 'Third Striker'). See Appendix G for an example of the bias analysis report.
Figure 3: Standard deviation of LLM-as-a-Judge scores across 15 evaluation runs per report. Higher values (darker cells) indicate greater scoring variability. Cross-Method Comparison (Cmp) exhibits the highest variability, while Limitations (Lim) and Attribution (Att) are nearly constant.
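The variability measure described in the Figure 3 caption can be sketched as a per-criterion standard deviation over repeated judge runs for one report. This is a hedged illustration: the function name `score_variability`, the use of the sample standard deviation, and the scores are assumptions, and only the criterion abbreviations (Cmp, Lim, Att) come from the caption.

```python
from statistics import stdev


def score_variability(runs: list[dict[str, float]]) -> dict[str, float]:
    """Return the standard deviation of each criterion's score across runs.

    `runs` holds one dict per evaluation run, mapping criterion -> score.
    Sample standard deviation is an assumption; the source does not say
    whether the sample or population form was used.
    """
    criteria = runs[0].keys()
    return {c: stdev([run[c] for run in runs]) for c in criteria}


if __name__ == "__main__":
    # Made-up scores for a single report across four runs (the paper uses 15).
    runs = [
        {"Cmp": 3, "Lim": 4, "Att": 5},
        {"Cmp": 5, "Lim": 4, "Att": 5},
        {"Cmp": 2, "Lim": 4, "Att": 5},
        {"Cmp": 4, "Lim": 4, "Att": 5},
    ]
    for criterion, sd in score_variability(runs).items():
        print(f"{criterion}: {sd:.2f}")
```

A criterion scored identically in every run yields a standard deviation of zero, which corresponds to the near-constant cells the caption describes for Limitations and Attribution.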
