Analyzing Bias in Swiss Federal Supreme Court Judgments Using Facebook's Holistic Bias Dataset: Implications for Language Model Training
Sabine Wehnert, Muhammet Ertas, Ernesto William De Luca
TL;DR
Problem: biases in the Swiss Judgment Prediction (SJP) dataset may propagate to NLP models trained to predict judgments. Approach: leverage Holistic Bias dispreferred descriptors, expand them multilingually to German, French, and Italian, apply extractive summarization and chunking to respect a 512-token limit, fine-tune a legal-domain RoBERTa-large, and assess the results with a binomial significance test and attention visualization. Contributions: (a) descriptor-based bias analysis of the SJP dataset across languages; (b) evidence that certain descriptors (e.g., 'victime', 'Opfer') correlate with biased predictions or translation artifacts; (c) demonstration of attention-attribution patterns and of the limitations of chunking and translation; (d) practical considerations for bias-aware training in multilingual legal NLP. Significance: informs data curation and model training to mitigate bias in legal NLP applications.
Abstract
Natural Language Processing (NLP) is vital for computers to process and respond accurately to human language. However, biases in training data can introduce unfairness, especially in legal judgment prediction. This study analyzes biases within the Swiss Judgment Prediction Dataset (SJP-Dataset). Our aim is to help ensure that the factual descriptions in the dataset are unbiased, which is essential for fair decision-making by NLP models in legal contexts. We analyze the dataset using social bias descriptors from the Holistic Bias dataset and employ advanced NLP techniques, including attention visualization, to explore the impact of dispreferred descriptors on model predictions. The study identifies biases and examines their influence on model behavior. Challenges include dataset imbalance and token limits, both of which affect model performance.
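To make the binomial significance test named in the TL;DR concrete, the following is a minimal sketch of an exact two-sided binomial test. It rests on assumptions: that the test asks whether cases mentioning a given descriptor receive a particular predicted label more or less often than a baseline rate would suggest, and that the counts and the baseline of 0.5 are purely hypothetical. The function names are illustrative and are not taken from the study.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    observed = binomial_pmf(k, n, p)
    # Relative tolerance so that outcomes tied with the observed one are included.
    threshold = observed * (1 + 1e-7)
    p_value = sum(
        pmf for pmf in (binomial_pmf(i, n, p) for i in range(n + 1)) if pmf <= threshold
    )
    return min(1.0, p_value)


# Hypothetical counts, for illustration only (not results from the paper):
# 40 cases mention a descriptor, 30 of them receive a given label,
# and the assumed baseline rate of that label is 0.5.
p_value = binomial_test_two_sided(k=30, n=40, p=0.5)
print(f"p-value: {p_value:.4f}")
```

A small p-value under these assumptions would indicate that the label rate among descriptor-bearing cases deviates from the baseline more than chance alone would explain; it would not by itself show that the descriptor causes the deviation.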
