Early Career Citations Capture Judicial Idiosyncrasies and Predict Judgments

Robert Mahari, Sandro Claudio Lera

Abstract

Judicial impartiality is a cornerstone of well-functioning legal systems. We assemble a dataset of 112,312 civil lawsuits in U.S. District Courts to study the effect of extraneous factors on judicial decision making. We show that cases are randomly assigned to judges and that biographical judge features are predictive of judicial decisions. We use low-dimensional representations of judges' early-career citation records as generic representations of judicial idiosyncrasies. These predict future judgments with accuracies exceeding 65% for high-confidence predictions on balanced out-of-sample test cases. For 6-8% of judges, these representations are significant predictors across all judgments. These findings indicate that a small but significant group of judges routinely relies on extraneous factors, and that careful vetting of judges prior to appointment may partially address this issue. Our use of low-dimensional representations of citation records may also be generalized to other jurisdictions or used to study other aspects of judicial decision making.
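The evaluation described in the abstract (accuracy for high-confidence predictions on a balanced out-of-sample test set) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything here is an assumption: the helper names `balance` and `accuracy_by_confidence` are hypothetical, balancing is done by downsampling the majority class, and confidence is taken to be the distance of the predicted plaintiff-win probability from 0.5.

```python
import random


def balance(labels, probs, seed=0):
    """Downsample the majority class so both outcomes are equally frequent.

    Assumption: labels are 1 (plaintiff wins) or 0 (plaintiff loses).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    keep = sorted(rng.sample(pos, n) + rng.sample(neg, n))
    return [labels[i] for i in keep], [probs[i] for i in keep]


def accuracy_by_confidence(labels, probs, n_bins=5):
    """Accuracy per confidence bin, from least to most confident.

    Assumption: confidence is |p - 0.5|, where p is the predicted
    probability that the plaintiff wins; a case is predicted as a win
    when p >= 0.5.
    """
    if len(labels) < n_bins:
        raise ValueError("need at least one case per bin")
    order = sorted(range(len(labels)), key=lambda i: abs(probs[i] - 0.5))
    accuracies = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        chunk = order[start:stop]
        correct = sum((probs[i] >= 0.5) == (labels[i] == 1) for i in chunk)
        accuracies.append(correct / len(chunk))
    return accuracies
```

On a balanced test set, random guessing yields an accuracy of about 50% in every bin, so an accuracy that rises toward the most confident bin indicates that the classifier's confidence score carries real information.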

Paper Structure (18 sections, 20 figures, 9 tables)

Figures (20)

  • Figure 1: Distribution of plaintiff win rates across judges. Red bars correspond to win rates that deviate significantly from the baseline. Significance is measured at the 10% level of a two-sided binomial test. The dotted line shows the distribution of win rates sampled from the binomial null-model $B(p=0.25,n)$. (inset) Associated distribution of p-values. 38% of all judges deviate from the baseline in a statistically significant manner. This is more than three times higher than the rate of false positives one would expect at the 10% level.
  • Figure 2: Accuracy of the gradient-boosting probabilistic classifier that predicts case outcomes. The model was trained on biographical judge features, and we report the accuracy per confidence-score quantile for each case type.
  • Figure 3: Shapley feature importance for civil rights cases. Past win rate is the most important feature, with high past win rates indicating high future win rates. This suggests a persistent effect of judicial idiosyncrasies. Similar plots for other case types are reported in the SI Appendix.
  • Figure 4: Overview of predictions made with early-career citation embeddings. (left) For a random selection of 20 judges, we show the normalized citation count of the 40 overall most commonly cited cases. We only count citations from the first 10% of the cases in each judge's career. In practice, we consider for each of 2,394 judges the normalized count of the 2,403 most commonly cited cases. (middle) NMF dimension reduction that compresses the 40 counts into 3-dimensional embeddings. In practice, we compress the counts of the 2,403 most frequently cited cases into 30-dimensional embeddings. (right) Gradient-boosting classification accuracy based on these 30-dimensional embeddings, separated by confidence-score quintile and case type.
  • Figure 5: Significance of aggregated predictions over judges' careers. For each judge, we balance the out-of-sample data such that half the cases are won by the plaintiff. We only consider judges for whom we have at least 50 out-of-sample cases. For each judge, we analyze the accuracy of predicting case outcomes. Our null hypothesis is that predictability is binomially distributed with $p=0.5$, i.e., there is no excess predictability. (top) Distribution of win rates, colored by whether the predictability is below (red), within (gray) or above (green) the 90% confidence interval. (bottom) Judges sorted by prediction accuracy along with their respective 90% confidence intervals (gray bar). Green (red) dots are significantly over- (under-) performing beyond the 90% confidence interval of what is expected from random guessing.
  • ...and 15 more figures
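
The captions of Figures 1 and 5 both rest on a two-sided binomial test at the 10% level: against a baseline win rate of $p=0.25$ in Figure 1 and against random guessing with $p=0.5$ in Figure 5. The following is a minimal standard-library sketch of such a test, not the authors' code. The function names are hypothetical, and the two-sided p-value convention (summing the probabilities of all outcomes no more likely than the observed one) is an assumption, since the captions do not say which convention was used.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binom_pvalue(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value under the null model B(p, n).

    Assumed convention: add up the probability of every outcome that is
    no more likely than the observed count k.
    """
    # Small relative tolerance so ties are not lost to floating-point error.
    threshold = binom_pmf(k, n, p) * (1 + 1e-7)
    total = 0.0
    for i in range(n + 1):
        prob = binom_pmf(i, n, p)
        if prob <= threshold:
            total += prob
    return min(1.0, total)


def deviates(successes: int, trials: int, baseline: float,
             level: float = 0.10) -> bool:
    """True if the observed rate differs significantly from the baseline."""
    return two_sided_binom_pvalue(successes, trials, baseline) < level


# Hypothetical judge: 40 plaintiff wins in 100 cases, baseline 0.25 (Figure 1).
print(deviates(40, 100, 0.25))
# Hypothetical judge: 35 correct predictions in 50 balanced cases (Figure 5).
print(deviates(35, 50, 0.5))
```

If no judge deviated from the baseline, about 10% of judges would still be flagged at the 10% level by chance alone, which is the false-positive rate that the caption of Figure 1 compares against the observed 38%.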