CASHomon Sets: Efficient Rashomon Sets Across Multiple Model Classes and their Hyperparameters

Fiona Katharina Ewald, Martin Binder, Matthias Feurer, Bernd Bischl, Giuseppe Casalicchio

Abstract

Rashomon sets are sets of models within one model class that perform nearly as well as a reference model from the same model class. They reveal the existence of alternative well-performing models, which may support different interpretations. This enables selecting models that match domain knowledge, hidden constraints, or user preferences. However, efficient construction methods currently exist for only a few model classes. Applied machine learning usually searches many model classes, and the best class is unknown beforehand. We therefore study Rashomon sets in the combined algorithm selection and hyperparameter optimization (CASH) setting and call them CASHomon sets. We propose TruVaRImp, a model-based active learning algorithm for level set estimation with an implicit threshold, and provide convergence guarantees. On synthetic and real-world datasets, TruVaRImp reliably identifies CASHomon set members and matches or outperforms naive sampling, Bayesian optimization, classical and implicit level set estimation methods, and other baselines. Our analyses of predictive multiplicity and feature-importance variability across model classes question the common practice of interpreting data through a single model class.
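
As a rough illustration of the idea of a Rashomon set spanning several model classes, the sketch below filters already-evaluated configurations by a loss threshold derived from the best observed loss. The threshold form `(1 + eps_rel) * best + eps_abs`, the `Candidate` type, the function name, and all numbers are assumptions made for this example and are not taken from the paper. The relative tolerance only makes sense for non-negative losses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One evaluated configuration (hypothetical container for this sketch)."""
    model_class: str
    config: dict
    loss: float  # lower is better; assumed non-negative


def near_optimal_members(candidates, eps_abs=0.0, eps_rel=0.0):
    """Keep candidates whose loss is within a tolerance of the best observed loss.

    Assumption: the threshold is (1 + eps_rel) * best_loss + eps_abs.
    """
    best = min(c.loss for c in candidates)
    threshold = (1.0 + eps_rel) * best + eps_abs
    return [c for c in candidates if c.loss <= threshold]


# Made-up losses for three model classes.
candidates = [
    Candidate("xgb", {"eta": 0.1}, 0.100),
    Candidate("nnet", {"size": 8}, 0.104),
    Candidate("cart", {"cp": 0.01}, 0.130),
]
members = near_optimal_members(candidates, eps_rel=0.05)
```

This exhaustive filter needs the loss of every candidate. The abstract's point is that TruVaRImp estimates set membership with a surrogate model instead of evaluating everything.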

Paper Structure (37 sections, 5 theorems, 40 equations, 8 figures, 8 tables, 2 algorithms)

Key Result

lemma 1

For a given $\delta \in (0,1)$ and a candidate set $D$, let $\beta_t \ge 2 \log \frac{|D|t^2\pi^2}{6\delta}$. Then with probability at least $1-\delta$, the true $c(\lambda)$ is in $[ \mu_t(\lambda) - \beta_t^{1/2}\sigma_t(\lambda), \mu_t(\lambda) + \beta_t^{1/2}\sigma_t(\lambda)]$ for all $\lambda
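
The interval in lemma 1 can be evaluated directly once a posterior mean and standard deviation are available. The following is a minimal sketch that uses the smallest admissible value of $\beta_t$ (the bound taken with equality). The function names and the example values of $\mu_t(\lambda)$ and $\sigma_t(\lambda)$ are assumptions for illustration; in practice they would come from the fitted GP.

```python
import math


def beta_t(num_candidates, t, delta):
    """Smallest beta_t allowed by the lemma: 2 * log(|D| * t^2 * pi^2 / (6 * delta))."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return 2.0 * math.log(num_candidates * t**2 * math.pi**2 / (6.0 * delta))


def confidence_interval(mu, sigma, beta):
    """Return (mu - sqrt(beta) * sigma, mu + sqrt(beta) * sigma)."""
    half_width = math.sqrt(beta) * sigma
    return mu - half_width, mu + half_width


# Made-up posterior values for a single configuration at iteration t = 1.
beta = beta_t(num_candidates=100, t=1, delta=0.1)
lower, upper = confidence_interval(mu=0.5, sigma=0.1, beta=beta)
```

Because $\beta_t$ grows with $\log t$, the interval at a fixed posterior standard deviation widens slowly over iterations. This is what allows the bound to hold simultaneously for all iterations.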

Figures (8)

  • Figure 1: Illustration of the TruVaRImp algorithm. The true (unknown) function is shown in green, with known function values $c(\lambda)$ as black dots, through which a GP model is fit (purple line with transparent confidence region). $M_t$ is the set of potential minimizers (purple strip at the bottom), for which the confidence interval crosses the "pessimistic" (i.e., highest possible within confidence region) minimum $c_{\mathrm{min},t}^\mathrm{pes}$ (upper red dashed line). Configuration points $\lambda$ are classified as belonging to the lower or upper set ($L_t$ and $H_t$, not shown), or remain unclassified ($U_t$, blue strip at the bottom), depending on whether their confidence interval crosses the region of plausible threshold values $[h_t^\mathrm{opt}, h_t^\mathrm{pes}]$ (blue dashed lines). In this illustration, these are relative to the minimum with $\varepsilon_{\mathrm{rel}} = 2, \varepsilon_{\mathrm{abs}} = 0$. TruVaRImp selects the point whose evaluation most reduces the variance that exceeds the target confidence values $\eta_{(i)}$, summed over the candidates in both $M_t$ and $U_t$ (green and purple ribbons around the GP mean).
  • Figure 2: Mean CASHomon set algorithm progress in terms of F1 score of the surrogate model predicting CASHomon set membership. The iteration count does not include the initial sample of 30 points per model class. The left facet shows the mean values over all datasets. The middle and right show individual performance on two example datasets.
  • Figure 3: Test performance and RC for tasks BC, CR, and CS (binarized) across the TreeFARMS Rashomon set and our CASHomon set. Top row: Brier score distributions. Bottom row: RC (y-axis) versus Brier score (x-axis); horizontal boxplots summarize test score distributions; triangles indicate the reference model.
  • Figure 4: VICs for our CASHomon set (left) and for the TreeFARMS Rashomon set (right) on task CS. For each feature, PFI values across models are shown as a boxplot together with a point cloud colored by model class, where identical values are vertically jittered. PFI values are scaled: the maximal PFI value for a model equals 1. In our CASHomon set, most models are xgb models (178), followed by nnet (66) and cart (33) models, but all model classes are present.
  • Figure 5: Visualization of Rashomon sets on a simple regression problem. Every coordinate on the plot corresponds to a linear model with the given coefficients. The "true" model $f^{\star}$ lies at $(1, 1)$ and has RMSE $0.5$. The black ellipse indicates the Rashomon set when taking $f_{\text{ref}}=f^{\star}$, $\varepsilon=0.15$, and the true risk $\mathcal{R}$. The red cross indicates $\hat{f^{\star}}$, a model that minimizes the empirical risk $\mathcal{R}_{\text{emp}}$, and the red ellipse represents the Rashomon set when using this model as reference. The shaded blue area represents $\mathcal{H}^{\mathcal{D}}_\text{HPO}$ as defined in Eq. \ref{eq_F_HP}: the models obtained by fitting an elastic net (using the glmnet R package; Zou and Hastie, 2005), parameterized by the lambda and alpha HPs, on the specific available dataset. The shade of blue indicates the value of alpha; lambda is not shown (though models with larger lambda values lie closer to the origin, corresponding to the regularizing effect of lambda). The shaded area within the red ellipse is the intersection of the Rashomon set and the models achievable through HP variation.
  • ...and 3 more figures
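
The per-model scaling of PFI values mentioned in the caption of Figure 4 (the maximal PFI value for a model equals 1) can be sketched as follows. The function name, the dictionary layout, and the example values are assumptions for illustration. Rejecting models whose largest PFI value is not positive is a choice made for this sketch, not something stated in the source.

```python
def scale_pfi(pfi_by_model):
    """Divide each model's PFI values by that model's largest PFI value.

    pfi_by_model maps a model identifier to a {feature: pfi_value} dict.
    """
    scaled = {}
    for model, pfi in pfi_by_model.items():
        top = max(pfi.values())
        if top <= 0:
            # Scaling is undefined here; this sketch refuses rather than guesses.
            raise ValueError(f"model {model!r} has no positive PFI value")
        scaled[model] = {feature: value / top for feature, value in pfi.items()}
    return scaled


# Made-up PFI values for two models and three features.
raw = {
    "xgb_1": {"age": 0.20, "income": 0.10, "tenure": 0.00},
    "cart_1": {"age": 0.05, "income": 0.50, "tenure": 0.25},
}
scaled = scale_pfi(raw)
```

After scaling, values are comparable across models as importance relative to each model's most important feature, which is what makes a joint boxplot per feature meaningful.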

Theorems & Definitions (9)

  • lemma 1: Srinivas et al., 2012
  • lemma 2
  • proof: by induction
  • lemma 3
  • lemma 4
  • proof
  • definition 1
  • theorem 1
  • proof