Data-Driven Law Firm Rankings to Reduce Information Asymmetry in Legal Disputes

Alexandre Mojon; Robert Mahari; Sandro Claudio Lera

TL;DR

This work introduces a new ranking framework that treats each lawsuit as a competitive game between plaintiff and defendant law firms, and shows that existing reputation-based rankings correlate poorly with actual litigation success, whereas the outcome-based ranking substantially improves predictive accuracy.

Abstract

Selecting capable counsel can shape the outcome of litigation, yet evaluating law firm performance remains challenging. Widely used rankings prioritize prestige, size, and revenue rather than empirical litigation outcomes, offering little practical guidance. To address this gap, we build on the Bradley-Terry model and introduce a new ranking framework that treats each lawsuit as a competitive game between plaintiff and defendant law firms. Leveraging a newly constructed dataset of 60,540 U.S. civil lawsuits involving 54,541 law firms, our findings show that existing reputation-based rankings correlate poorly with actual litigation success, whereas our outcome-based ranking substantially improves predictive accuracy. These findings establish a foundation for more transparent, data-driven assessments of legal performance.
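
For reference, the standard Bradley-Terry model that the abstract builds on can be written as follows. This is the textbook form, not the paper's extended variant, and the symbols $s_i$ and $s_j$ are generic notation assumed here for illustration. Each competitor $i$ has a latent score $s_i$, and the probability that $i$ beats $j$ is a logistic function of the score difference:

$$
P(i \text{ beats } j) = \frac{e^{s_i}}{e^{s_i} + e^{s_j}} = \frac{1}{1 + e^{-(s_i - s_j)}}
$$

In the setting described above, the two competitors in each "game" are the plaintiff's and the defendant's law firms.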

Paper Structure (13 sections, 13 equations, 19 figures, 3 tables)

Transformer-Based Case Labeling
Case Outcome Classification
Case Type Classification
Summary
Extracting Law Firm Names and Roles
String Matching
Clustering Law Firm Names
Sub-sampling Pairwise Interactions via $Q$-Factor
Case Statistics
Fitting AHPI Algorithm on Synthetic Data
Legal Case Outcome Predictions for Various $Q$-Factors
Use of Sigmoid Function in Ranking Algorithm
Public Law Firm Rankings

Figures (19)

Figure 1: We analyze $N$ lawsuits based on judges' textual decisions (opinions), from which we extract structured information on the case type, the case outcome, and the law firms involved. Each lawsuit is modeled as a game between the plaintiff's and defendant's law firm, where one firm wins and the other loses. Each of the five case types has an associated defendant bias (home field advantage) $\epsilon_m$, which quantifies the a priori likelihood that the defendant prevails in case type $m$. Each law firm $k = 1, \ldots, K$ is assigned a latent skill score, $S_k$. The propensity that defendant's law firm $B$ is favored over the plaintiff's law firm $A$ is modeled as a sigmoid function of $(S_B + \epsilon_m) - S_A$, such that a higher bias-adjusted score for $B$ increases its likelihood of being favored. The valence probability $q_m$ then determines the probability that the favored firm ultimately wins, capturing the role of uncertainty in case outcomes. Using these assumptions, we apply an expectation-maximization algorithm to infer the latent law firm scores $\{ S_k \}$, defendant biases $\{\epsilon_m\}$, and valence probabilities $\{q_m\}$ that best explain the observed litigation outcomes. These rankings are then used to predict case outcomes, outperforming existing reputation-based rankings (Figure \ref{fig:rank_clock}). The inferred defendant biases $\epsilon_m$ are consistently positive (Table \ref{tab:case_summary}), reflecting the well-documented advantage of defendants in litigation. The valence probabilities $q_m$ are found to be high (between 85% and 100%, as shown in Table \ref{tab:case_summary}), underscoring the significant role law firms play in shaping case outcomes.
Figure 1: Trade-off between data augmentation accuracy and dataset size. Accuracy on test data (left y-axis) and number of retained cases (right y-axis) as a function of the classifier confidence threshold $\tau$.
Figure 2: We compare our AHPI scores with three widely used firm rankings: Vault 100, ALM's Global 200, and Embroker Top 300. (Top) Correlation between the AHPI ranking and each of these rankings, and predictive accuracy on test cases given each ranking, where 50% represents the expected accuracy from random guessing. (Bottom) Comparison of ranks for 20 law firms that are common across all four rankings. A point close to the periphery means a higher rank than one closer to the center. As is visually apparent and as indicated by the low correlation coefficients in the 'Correlation' column, there are significant differences between AHPI scores and other rankings.
Figure 2: Decreasing number of interactions (left) and entities (right) as the $Q$-factor is increased.
Figure 3: Predicted propensity of a defendant win for out-of-sample test cases, grouped into six bins. (Top) Number of cases within each bin. (Bottom) Winning propensity based on the AHPI ranking compared to actual defendant win rate in each bin. Error bars indicate standard deviations computed via 100 bootstrap resamples. The dotted line corresponds to the 83% baseline defendant win-rate across all cases. Empirical defendant win-rates for cases with low propensities are significantly below the baseline while win-rates for cases with high propensities are significantly above the baseline, highlighting the model's ability to predict case outcomes based on law firm rankings.
...and 14 more figures
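
The generative model described in the first Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The way the valence probability $q_m$ is combined with the propensity is one plausible reading of the caption: the favored firm wins with probability $q_m$, so the defendant wins either when it is favored and the favored firm wins, or when the plaintiff is favored and the favored firm loses.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def defendant_favored_propensity(s_defendant: float,
                                 s_plaintiff: float,
                                 epsilon_m: float) -> float:
    """Propensity that defendant firm B is favored over plaintiff firm A.

    Follows the caption: sigmoid of (S_B + epsilon_m) - S_A, where
    epsilon_m is the defendant bias for case type m.
    """
    return sigmoid((s_defendant + epsilon_m) - s_plaintiff)


def defendant_win_probability(s_defendant: float,
                              s_plaintiff: float,
                              epsilon_m: float,
                              q_m: float) -> float:
    """Probability that the defendant wins (ASSUMED mixture form).

    Assumption: the favored firm wins with probability q_m, so
    P(defendant wins) = q_m * p + (1 - q_m) * (1 - p),
    where p is the propensity that the defendant is favored.
    """
    p = defendant_favored_propensity(s_defendant, s_plaintiff, epsilon_m)
    return q_m * p + (1.0 - q_m) * (1.0 - p)


if __name__ == "__main__":
    # Hypothetical scores and parameters, chosen only for illustration.
    p_win = defendant_win_probability(
        s_defendant=0.3, s_plaintiff=0.1, epsilon_m=0.5, q_m=0.9
    )
    print(f"Defendant win probability: {p_win:.3f}")
```

Under this reading, $q_m = 1$ reduces the win probability to the sigmoid propensity itself, while $q_m = 0.5$ makes the outcome a coin flip regardless of the firms' scores. The expectation-maximization fitting step mentioned in the caption is not shown here.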
