Japanese Tort-case Dataset for Rationale-supported Legal Judgment Prediction

Hiroaki Yamada, Takenobu Tokunaga, Ryutaro Ohara, Akira Tokutsu, Keisuke Takeshita, Mihoko Sumida

TL;DR

The paper introduces the Japanese Tort-case Dataset (JTD), the first large-scale, real-judgment dataset for Japanese Legal Judgment Prediction, and defines two tasks: Tort Prediction and Rationale Extraction. It provides a detailed, multi-stage annotation pipeline with 41 legal experts and character-level rationale spans, supported by inter-annotator reliability analyses. A hierarchical Inter-Span Transformer (IST) architecture plus multi-task learning establishes strong baselines, showing that joint modeling of outcomes and rationales improves performance, albeit with substantial room for improvement relative to expert judgments. Error analysis highlights missing external knowledge and data limitations inherent to publicly available judgment documents, informing future dataset enhancements and modeling strategies. The work offers a valuable resource for Japanese legal NLP and sets a foundation for cross-jurisdictional, explainable LJP research in civil law contexts.

Abstract

This paper presents the Japanese Tort-case Dataset (JTD), the first dataset for Japanese Legal Judgment Prediction (LJP). It features two tasks: tort prediction and rationale extraction. The rationale extraction task, which is novel in the field, identifies which of the arguments alleged by plaintiffs and defendants the court accepted. JTD is built from 3,477 Japanese Civil Code judgments annotated by 41 legal experts, yielding 7,978 instances with 59,697 arguments alleged by the parties involved. Our baseline experiments show that both proposed tasks are feasible, and our error analysis by legal experts identifies sources of errors and suggests future directions for LJP research.
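To make the two tasks concrete, the following is a minimal sketch of how a single instance might be represented. The class names, field names, and the helper `split_rationales` are hypothetical and chosen for illustration only; they are not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Claim:
    """One argument alleged by a party (hypothetical layout)."""
    text: str
    party: str      # "plaintiff" or "defendant"
    accepted: bool  # rationale label: did the court accept this argument?


@dataclass
class TortInstance:
    """One prediction instance (hypothetical layout)."""
    facts: str           # facts of the case, as given in the judgment
    claims: List[Claim]  # arguments alleged by both parties
    tort: bool           # tort prediction label: was the tort affirmed?


def split_rationales(inst: TortInstance) -> Tuple[List[bool], List[bool]]:
    """Return the rationale labels grouped by party: (plaintiff, defendant)."""
    r_p = [c.accepted for c in inst.claims if c.party == "plaintiff"]
    r_d = [c.accepted for c in inst.claims if c.party == "defendant"]
    return r_p, r_d


# Toy data invented for illustration; not taken from JTD.
example = TortInstance(
    facts="The defendant published an article about the plaintiff.",
    claims=[
        Claim("The article lowered the plaintiff's reputation.", "plaintiff", True),
        Claim("The article concerned a matter of public interest.", "defendant", False),
        Claim("The article caused financial loss.", "plaintiff", False),
    ],
    tort=True,
)
```

Under this layout, tort prediction targets the single `tort` label, while rationale extraction targets one `accepted` label per alleged argument.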

Paper Structure (28 sections, 1 equation, 5 figures, 12 tables)

Figures (5)

  • Figure 1: Rationale-supported legal judgment prediction featuring a tort case
  • Figure 2: An example of a defamation case. Translations are ours and have been modified for presentation purposes. $R_{g}^{P}$ and $R_{g}^{D}$ represent gold labels for $R^{P}$ and $R^{D}$, respectively.
  • Figure 3: Dataset Construction
  • Figure 4: Architecture of the IST models for legal judgment prediction with rationale extraction, where there are $N$ claims in total and $M$ out of $N$ claims are from Plaintiff. Fact embedding $E^{f}$ is shared across all claims, while party-type embeddings $E^{p}$ and $E^{d}$ are exclusive to the plaintiff's and defendant's claims, respectively. $E^{ps}_{n}$ is a position embedding for the $n$-th claim.
  • Figure 5: Confidence scores by human expert
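
The Figure 4 caption describes party-type and position embeddings attached to each claim. The sketch below illustrates one way such embeddings could be combined. It assumes additive composition, which the caption does not state, and it leaves out the fact embedding $E^{f}$ for brevity. The function name `add_claim_embeddings` and all array shapes are hypothetical.

```python
import numpy as np


def add_claim_embeddings(claim_vecs, num_plaintiff, e_p, e_d, e_pos):
    """Add party-type and position embeddings to claim vectors.

    claim_vecs:    (N, d) array, one vector per claim, plaintiff claims first
    num_plaintiff: M, the number of claims from the plaintiff
    e_p, e_d:      (d,) party-type embeddings for plaintiff / defendant
    e_pos:         (>=N, d) position embeddings, row n for the n-th claim
    Assumption: the embeddings are summed; the source does not confirm this.
    """
    n = claim_vecs.shape[0]
    # Rows 0..M-1 get the plaintiff embedding, rows M..N-1 the defendant one.
    is_plaintiff = np.arange(n)[:, None] < num_plaintiff
    party = np.where(is_plaintiff, e_p, e_d)
    return claim_vecs + party + e_pos[:n]


# Toy values: N = 3 claims, M = 2 from the plaintiff, d = 4.
claims = np.zeros((3, 4))
out = add_claim_embeddings(
    claims,
    num_plaintiff=2,
    e_p=np.ones(4),
    e_d=2 * np.ones(4),
    e_pos=np.zeros((3, 4)),
)
```

In a trained model these embeddings would be learned parameters; fixed arrays are used here only to keep the example self-contained.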