LawThinker: A Deep Research Legal Agent in Dynamic Environments

Xinyu Yang, Chenlong Deng, Tongyu Wen, Binyu Xie, Zhicheng Dou

TL;DR

LawThinker targets legally valid reasoning by enforcing verification after every knowledge exploration step in dynamic judicial settings. Its Explore-Verify-Memorize framework, anchored by the DeepVerifier and two memory channels, is designed to limit error propagation and maintain procedural compliance across long-horizon tasks. Results on the dynamic J1-EVAL benchmark and three static benchmarks show significant improvements in both outcome accuracy and process-oriented metrics, with strong performance in courtroom simulation contexts. Supported by a suite of 15 specialized tools, the approach offers a practical path toward reliable, legally grounded AI assistance in drafting, consultation, and adjudication tasks.

Abstract

Legal reasoning requires not only correct outcomes but also procedurally compliant reasoning processes. However, existing methods lack mechanisms to verify intermediate reasoning steps, allowing errors such as inapplicable statute citations to propagate undetected through the reasoning chain. To address this, we propose LawThinker, an autonomous legal research agent that adopts an Explore-Verify-Memorize strategy for dynamic judicial environments. The core idea is to enforce verification as an atomic operation after every knowledge exploration step. A DeepVerifier module examines each retrieval result along three dimensions: knowledge accuracy, fact-law relevance, and procedural compliance. A memory module supports cross-round knowledge reuse in long-horizon tasks. Experiments on the dynamic benchmark J1-EVAL show that LawThinker achieves a 24% improvement over direct reasoning and an 11% gain over workflow-based methods, with particularly strong improvements on process-oriented metrics. Evaluations on three static benchmarks further confirm its generalization capability. The code is available at https://github.com/yxy-919/LawThinker-agent .
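
The Explore-Verify-Memorize control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Verdict`, `Memory`, `explore_verify_memorize`, `toy_explore`, `toy_verify`) is hypothetical, the explorer and verifier are toy stand-ins for retrieval tools and the DeepVerifier, and a single dictionary stands in for the memory module. The only details taken from the abstract are that verification runs after every exploration step, that it checks three dimensions, and that verified knowledge is kept for reuse in later rounds.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The three verification dimensions named in the abstract.
DIMENSIONS = ("knowledge_accuracy", "fact_law_relevance", "procedural_compliance")


@dataclass
class Verdict:
    """Hypothetical per-dimension result of checking one retrieval."""
    passed: Dict[str, bool]

    @property
    def ok(self) -> bool:
        # A retrieval is accepted only if all three dimensions pass.
        return all(self.passed.get(d, False) for d in DIMENSIONS)


@dataclass
class Memory:
    """Hypothetical store of verified knowledge, reused across rounds."""
    verified: Dict[str, str] = field(default_factory=dict)


def explore_verify_memorize(
    queries: List[str],
    explore: Callable[[str], str],
    verify: Callable[[str, str], Verdict],
    memory: Memory,
) -> List[str]:
    """Return only knowledge that passed verification (assumed control flow)."""
    accepted: List[str] = []
    for query in queries:
        if query in memory.verified:       # reuse previously verified knowledge
            accepted.append(memory.verified[query])
            continue
        result = explore(query)            # explore
        verdict = verify(query, result)    # verify, always right after exploring
        if verdict.ok:
            memory.verified[query] = result  # memorize
            accepted.append(result)
        # A rejected result is dropped, so it cannot propagate downstream.
    return accepted


# Toy stand-ins with placeholder data (not real statutes or real tools).
TOY_CORPUS = {"q1": "Article A (applicable)", "q2": "Article B (inapplicable)"}
explore_calls: List[str] = []


def toy_explore(query: str) -> str:
    explore_calls.append(query)
    return TOY_CORPUS[query]


def toy_verify(query: str, result: str) -> Verdict:
    relevant = "inapplicable" not in result
    return Verdict({
        "knowledge_accuracy": True,
        "fact_law_relevance": relevant,
        "procedural_compliance": True,
    })
```

Calling `explore_verify_memorize(["q1", "q2", "q1"], toy_explore, toy_verify, Memory())` in this toy setup drops the placeholder result for `q2`, because it fails the fact-law relevance check, and serves the second `q1` from memory without exploring again.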

Paper Structure (29 sections, 10 figures, 5 tables)

Figures (10)

  • Figure 1: An example of error propagation from an incorrect statute citation. Both methods reach the same affirmative outcome, but direct reasoning cites an inapplicable article, whereas LawThinker mitigates this issue by exploring with explicit verification. This underscores the necessity of process-level compliance in legal reasoning beyond answer correctness.
  • Figure 2: Overview of our autonomous legal research agent LawThinker, which adopts an Explore-Verify-Memorize strategy, integrating iterative exploration with explicit verification during reasoning and interacting closely with a memory module.
  • Figure 3: Overview of tools used in exploration, verification, and memorization. Details of these tools can be found in Table \ref{tab:tool}.
  • Figure 4: Ablation study across six legal scenarios.
  • Figure 5: Performance with outcome-oriented and process-oriented metrics across models and reasoning paradigms.
  • ...and 5 more figures