Paragraph-level Rationale Extraction through Regularization: A case study on European Court of Human Rights Cases
Ilias Chalkidis, Manos Fergadiotis, Dimitrios Tsarapatsanis, Nikolaos Aletras, Ion Androutsopoulos, Prodromos Malakasiotis
TL;DR
The paper tackles explainability in legal NLP by shifting rationales from word-level to paragraph-level selections in European Court of Human Rights (ECtHR) cases. It introduces a baseline HierBERT-based model and a set of rationale regularizers, namely Sparsity ($L_s$), Continuity ($L_c$), Comprehensiveness variants ($L_g$), and a new Singularity constraint ($L_r$), which guide paragraph-level rationale selection while the model predicts alleged ECHR article violations. A new ECtHR dataset of 11k cases, with silver rationales and a gold-annotated subset, is released to support this task. Empirical results show that Continuity may not help in paragraph-level settings, while carefully reformulated Comprehensiveness and the novel Singularity constraint improve rationale quality and faithfulness without sacrificing classification performance. Agreement with gold rationales remains hard to achieve, leaving ample room for future work on debiasing and evaluation. The work establishes a foundation for explainable, law-focused NLP and links rationale extraction to self-supervised summarization of long legal documents.
Abstract
Interpretability or explainability is an emerging research field in NLP. From a user-centric point of view, the goal is to build models that provide proper justification for their decisions, similar to those of humans, by requiring the models to satisfy additional constraints. To this end, we introduce a new application on legal text where, contrary to mainstream literature targeting word-level rationales, we conceive rationales as selected paragraphs in multi-paragraph structured court cases. We also release a new dataset comprising European Court of Human Rights cases, including annotations for paragraph-level rationales. We use this dataset to study the effect of already proposed rationale constraints, i.e., sparsity, continuity, and comprehensiveness, formulated as regularizers. Our findings indicate that some of these constraints are not beneficial in paragraph-level rationale extraction, while others need re-formulation to better handle the multi-label nature of the task we consider. We also introduce a new constraint, singularity, which further improves the quality of rationales, even compared with noisy rationale supervision. Experimental results indicate that the newly introduced task is very challenging and there is a large scope for further research.
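To make the idea of rationale constraints "formulated as regularizers" concrete, the following is a minimal sketch, not the paper's implementation, of how sparsity and continuity penalties over a paragraph-selection mask might be computed and added to a task loss. The function names, the target selection rate, and the weights are hypothetical, and the formulas follow a common formulation from the rationale-extraction literature that is assumed here rather than taken from the text above.

```python
from typing import Sequence


def sparsity_penalty(mask: Sequence[float], target: float) -> float:
    """Absolute gap between the mean selection rate and a target rate.

    `mask` holds one value in [0, 1] per paragraph (1 = selected).
    `target` is an assumed hyperparameter, not a value from the paper.
    """
    return abs(target - sum(mask) / len(mask))


def continuity_penalty(mask: Sequence[float]) -> float:
    """Mean absolute change between adjacent paragraph selections.

    Low values mean the selected paragraphs form contiguous runs.
    """
    if len(mask) < 2:
        return 0.0
    jumps = sum(abs(cur - prev) for prev, cur in zip(mask[:-1], mask[1:]))
    return jumps / (len(mask) - 1)


def regularized_loss(
    task_loss: float,
    mask: Sequence[float],
    target: float = 0.3,    # hypothetical target selection rate
    lambda_s: float = 1.0,  # hypothetical sparsity weight
    lambda_c: float = 0.0,  # hypothetical continuity weight
) -> float:
    """Task loss plus weighted rationale regularizers."""
    return (
        task_loss
        + lambda_s * sparsity_penalty(mask, target)
        + lambda_c * continuity_penalty(mask)
    )


if __name__ == "__main__":
    # Five paragraphs, of which the first and fourth are selected.
    example_mask = [1, 0, 0, 1, 0]
    print(sparsity_penalty(example_mask, target=0.4))
    print(continuity_penalty(example_mask))
    print(regularized_loss(0.5, example_mask, target=0.4, lambda_c=1.0))
```

In this sketch, setting `lambda_c` to zero switches the continuity term off, which mirrors the kind of ablation the abstract alludes to when it says that some constraints are not beneficial at the paragraph level.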
