Agents on the Bench: Large Language Model Based Multi Agent Framework for Trustworthy Digital Justice

Cong Jiang, Xiaolei Yang

TL;DR

AgentsBench is an LLM-based multi-agent framework that simulates a collegiate judicial bench for trustworthy digital justice. It deploys heterogeneous LLM-driven agents that each propose a sentence independently, then deliberate and reach a consensus under a presiding judge, with the aim of improving decision quality, fairness, and explainability. On the Prison Term Prediction task from LawBench, AgentsBench surpasses single-model baselines in performance and morality, and in a complex bribery-fraud case its consensus sentence comes close to the gold label. The framework offers a scalable, transparent path to more nuanced AI-assisted judicial decision-making across diverse case types and jurisdictions.

Abstract

The justice system has increasingly employed AI techniques to enhance efficiency, yet limitations remain in improving the quality of decision-making, particularly regarding the transparency and explainability needed to uphold public trust in legal AI. To address these challenges, we propose a large language model based multi-agent framework named AgentsBench, which aims to improve both efficiency and quality in judicial decision-making. Our approach leverages multiple LLM-driven agents that simulate the collaborative deliberation and decision-making process of a judicial bench. We conducted experiments on a legal judgment prediction task, and the results show that our framework outperforms existing LLM-based methods in terms of performance and decision quality. By incorporating these elements, our framework reflects real-world judicial processes more closely, enhancing accuracy, fairness, and societal consideration. AgentsBench provides a more nuanced and realistic method of trustworthy AI decision-making, with strong potential for application across various case types and legal scenarios.

Paper Structure

This paper contains 24 sections, 7 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: Overview of the AgentsBench Framework. The figure illustrates the framework simulating a judicial decision-making process. It features an agent bench consisting of two lay judges and one professional judge. Each agent initially proposes an independent sentencing decision based on the case details. Subsequently, the agents engage in multi-round deliberation, moderated by the professional judge, to reconcile differing perspectives and achieve consensus. This collaborative process reflects the essence of AgentsBench, leveraging diverse viewpoints to reach a balanced and fair final judgment, which considers both legal standards and social effects.
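The flow described in Figure 1 (independent proposals, then multi-round deliberation, then a consensus) can be illustrated with a small Python sketch. This is not the authors' implementation. Every name here (`BenchAgent`, `deliberate`, `make_stub_agent`) is hypothetical, the LLM calls are replaced by deterministic stubs that move toward their peers' average, and the stopping rule (a tolerance in months, with the median as the final term) is an assumption made only for illustration. The professional judge's moderating role is reduced to the convergence check inside the loop.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Tuple

# Hypothetical agent interface: (case text, peers' latest proposals) -> months.
ProposeFn = Callable[[str, List[float]], float]


@dataclass
class BenchAgent:
    name: str
    role: str  # "professional" or "lay"
    propose: ProposeFn


def make_stub_agent(name: str, role: str, initial: float, openness: float) -> BenchAgent:
    """Stand-in for an LLM-driven judge; a real agent would prompt a model."""
    state = {"current": initial}

    def propose(case_text: str, peers: List[float]) -> float:
        if peers:  # deliberation round: move part of the way toward the peers
            peer_mean = sum(peers) / len(peers)
            state["current"] += openness * (peer_mean - state["current"])
        return state["current"]  # first call (no peers): independent proposal

    return BenchAgent(name, role, propose)


def deliberate(
    case_text: str,
    agents: List[BenchAgent],
    max_rounds: int = 5,
    tolerance: float = 1.0,
) -> Tuple[float, List[List[float]]]:
    """Return (final term in months, per-round proposal history)."""
    # Step 1: each agent sentences independently, seeing no peer opinions.
    proposals = [agent.propose(case_text, []) for agent in agents]
    history = [list(proposals)]

    # Step 2: deliberate until proposals agree within `tolerance` months
    # (assumed consensus rule) or the round budget runs out.
    for _ in range(max_rounds):
        if max(proposals) - min(proposals) <= tolerance:
            break
        proposals = [
            agent.propose(case_text, [p for j, p in enumerate(proposals) if j != i])
            for i, agent in enumerate(agents)
        ]
        history.append(list(proposals))

    # Step 3: report the median of the last round as the bench's decision.
    return median(proposals), history


if __name__ == "__main__":
    bench = [
        make_stub_agent("presiding", "professional", 48.0, 0.5),
        make_stub_agent("lay_1", "lay", 36.0, 0.5),
        make_stub_agent("lay_2", "lay", 60.0, 0.5),
    ]
    term, rounds = deliberate("case facts go here", bench)
    print(f"final term: {term} months after {len(rounds) - 1} deliberation rounds")
```

Keeping the per-round history alongside the final term mirrors the explainability goal stated in the abstract: the path to the consensus stays inspectable, not only the result.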