Combining topic modelling and citation network analysis to study case law from the European Court on Human Rights on the right to respect for private and family life

M. Mohammadi, L. M. Bruijn, M. Wieling, M. Vols

TL;DR

The study tackles the challenge of navigating large-scale ECtHR Article 8 case law by evaluating three experiments that combine topic modelling and citation-network analysis. It first uses LDA to identify broad topics, then applies the Louvain algorithm to detect citation-based communities, and finally integrates topic similarity as link weights to form more cohesive eviction-focused groups. The results show that the hybrid approach retrieves more eviction-related cases (e.g., 211 detected from 361 candidates, including 83 new cases) than either method alone, demonstrating increased precision and coverage. This provides a scalable, interpretable framework for legal researchers to locate and analyze case law on specific rights issues, with practical implications for evidence gathering and doctrinal development across jurisdictions.

Abstract

As legal case law databases such as HUDOC continue to grow rapidly, it has become essential for legal researchers to find efficient methods to handle such large-scale data sets. Such case law databases usually consist of the textual content of cases together with the citations between them. This paper focuses on case law from the European Court of Human Rights on Article 8 of the European Convention on Human Rights, the right to respect for private and family life, home and correspondence. In this study, we demonstrate and compare the potential of topic modelling and citation network analysis to find and organize case law on Article 8 based on their general themes and citation patterns, respectively. Additionally, we explore whether combining these two techniques leads to better results than applying only one of the methods. We evaluate the effectiveness of the combined method on a unique manually collected and annotated dataset of Article 8 case law on evictions. The results of our experiments show that our combined (text and citation-based) approach provides the best results in finding and grouping case law, providing scholars with an effective way to extract and analyse relevant cases on a specific issue.
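
To make the combined approach concrete, the following is a minimal sketch of one way topic similarity could be attached to citation links as weights before community detection. It is not the authors' implementation: the choice of cosine similarity, the function names `cosine_similarity` and `weight_citations`, and the toy case identifiers and topic distributions are all assumptions made for illustration.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an all-zero vector carries no topic information
    return dot / (norm_p * norm_q)


def weight_citations(citations, topic_dist):
    """Return (citing, cited, weight) triples, one per citation link.

    The weight is the topic similarity of the two cases, so links between
    thematically close cases count more in a weighted community search.
    """
    return [
        (src, dst, cosine_similarity(topic_dist[src], topic_dist[dst]))
        for src, dst in citations
    ]


# Hypothetical toy data: per-case topic distributions (e.g. from a topic model)
topic_dist = {
    "case_A": [0.8, 0.1, 0.1],
    "case_B": [0.7, 0.2, 0.1],
    "case_C": [0.1, 0.1, 0.8],
}
# Hypothetical citation links (citing case, cited case)
citations = [("case_A", "case_B"), ("case_A", "case_C")]

weighted_edges = weight_citations(citations, topic_dist)
for src, dst, weight in weighted_edges:
    print(f"{src} -> {dst}: {weight:.3f}")
```

In this toy example the link between the two thematically similar cases receives a higher weight than the link to the thematically distant one. The weighted edges could then be passed to a weighted community-detection routine, for instance the Louvain implementation in a graph library such as networkx, which is likewise an assumption about tooling rather than something stated here.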

Paper Structure

This paper contains 20 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Coherence scores capturing semantic similarity between the most prominent words for each topic.
  • Figure 2: Overview of the most prominent words within each topic where the size of a word captures the likelihood of its occurrence within a topic. Note that CAP is an abbreviation for Constitutional and administrative proceedings.
  • Figure 3: Visualization (via t-SNE) of cases based on their topics. Each case is colored according to its most significant topic.
  • Figure 4: Visualization of LDA's output, using t-SNE, for both judgments and decisions.
  • Figure 5: Topic distribution within eviction-related clusters.