Efficient Exploration of the Rashomon Set of Rule Set Models

Martino Ciaperoni, Han Xiao, Aristides Gionis

TL;DR

The paper tackles the challenge of exploring the Rashomon set of near-optimal rule-set models, rather than committing to a single best rule set, to enhance interpretability and fairness analysis. It formalizes a binary-rule framework with the objective $f(S;\lambda)=\ell(S)+\lambda|S|$, and defines the Rashomon set as $\mathcal{R}(\mathcal U,\lambda,\theta)=\{S:\; f(S;\lambda)\le\theta\}$. The authors introduce an exact branch-and-bound enumeration algorithm BB-enum with incremental computation and pruning bounds, plus approximation methods Approx-Sample and Approx-Count based on random parity constraints, and a faster BB-sts that uses a search-tree approach. Through extensive experiments on datasets including Compas, Mushrooms, Credit, and Voting, they show that near-uniform sampling and size estimation of the Rashomon set are feasible at scale, enabling reliable analysis of feature importance and fairness across many near-optimal rule sets. Overall, the work provides a scalable toolkit for non-exhaustive exploration of interpretable Rashomon sets, with practical implications for understanding task complexity, fairness, and variable importance in deployment contexts.
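To make the two definitions concrete, here is a minimal brute-force sketch that enumerates the Rashomon set on a toy problem. It is not one of the paper's algorithms. It assumes that a rule set predicts positive whenever any of its rules covers a point and that $\ell(S)$ is the misclassification rate. The data (`LABELS`, `COVERS`) and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: 4 labelled points and 3 candidate rules.
# COVERS[r] is the set of point indices that rule r captures.
LABELS = [1, 1, 0, 0]
COVERS = [{0, 1}, {0}, {1, 2}]


def objective(rule_set, covers, labels, lam):
    """f(S; lambda) = l(S) + lambda * |S|, with l(S) the error rate (assumed)."""
    covered = set()
    for r in rule_set:
        covered |= covers[r]
    # A point is predicted positive iff at least one rule covers it.
    errors = sum((i in covered) != bool(y) for i, y in enumerate(labels))
    return errors / len(labels) + lam * len(rule_set)


def rashomon_set(covers, labels, lam, theta):
    """All rule sets S (as index tuples) with f(S; lambda) <= theta."""
    rules = range(len(covers))
    return [
        S
        for k in range(len(covers) + 1)
        for S in combinations(rules, k)
        if objective(S, covers, labels, lam) <= theta
    ]


print(rashomon_set(COVERS, LABELS, lam=0.1, theta=0.4))
# [(0,), (1,), (0, 1)]
```

The enumeration visits all $2^{|\mathcal U|}$ subsets, which is exactly the cost that the pruning bounds and sampling-based methods summarized above are meant to avoid.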

Abstract

Today, as increasingly complex predictive models are developed, simple rule sets remain a crucial tool to obtain interpretable predictions and drive high-stakes decision making. However, a single rule set provides a partial representation of a learning task. An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance. Existing work on Rashomon-set exploration focuses on exhaustive search of the Rashomon set for particular classes of models, which can be a computationally challenging task. On the other hand, exhaustive enumeration leads to redundancy that often is not necessary, and a representative sample or an estimate of the size of the Rashomon set is sufficient for many applications. In this work, we propose, for the first time, efficient methods to explore the Rashomon set of rule set models with or without exhaustive search. Extensive experiments demonstrate the effectiveness of the proposed methods in a variety of scenarios.

Paper Structure (46 sections, 15 theorems, 19 equations, 8 figures, 3 tables, 15 algorithms)

Key Result

Theorem 1

For any rule set $S \subseteq \mathcal{U}$ and any rule set $S' \subseteq \mathcal{U}$ that starts with $S$, we have $f(S') \ge b(S)$.
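The following sketch shows how a bound of this kind can prune a branch-and-bound enumeration. It rests on assumptions, because this page does not define $b$: here $b(S)$ is taken to be the false-positive rate of $S$ plus $\lambda|S|$, "starts with $S$" is read as extending $S$ with rules of larger index, and a rule set predicts positive when any rule covers a point. Under these assumptions adding rules can never remove a false positive, so $b(S) \le f(S')$ for every extension $S'$. The names `lower_bound` and `bb_enumerate` are illustrative, and this is not the paper's BB-enum.

```python
# Hypothetical toy data: 4 labelled points and 3 candidate rules.
LABELS = [1, 1, 0, 0]
COVERS = [{0, 1}, {0}, {1, 2}]


def covered_points(rule_set, covers):
    """Union of the points captured by the rules in rule_set."""
    out = set()
    for r in rule_set:
        out |= covers[r]
    return out


def objective(rule_set, covers, labels, lam):
    """f(S) = error rate + lam * |S| (error rate is an assumed loss)."""
    covered = covered_points(rule_set, covers)
    errors = sum((i in covered) != bool(y) for i, y in enumerate(labels))
    return errors / len(labels) + lam * len(rule_set)


def lower_bound(rule_set, covers, labels, lam):
    """Assumed b(S): false-positive rate + lam * |S|."""
    covered = covered_points(rule_set, covers)
    false_pos = sum(1 for i in covered if not labels[i])
    return false_pos / len(labels) + lam * len(rule_set)


def bb_enumerate(covers, labels, lam, theta):
    """Enumerate {S : f(S) <= theta}, skipping subtrees with b(S) > theta."""
    found, stack = [], [()]
    while stack:
        S = stack.pop()
        if lower_bound(S, covers, labels, lam) > theta:
            continue  # no rule set that starts with S can satisfy f <= theta
        if objective(S, covers, labels, lam) <= theta:
            found.append(S)
        start = S[-1] + 1 if S else 0
        for r in range(start, len(covers)):
            stack.append(S + (r,))  # extend only with larger rule indices
    return sorted(found, key=lambda s: (len(s), s))


print(bb_enumerate(COVERS, LABELS, lam=0.1, theta=0.4))
# [(0,), (1,), (0, 1)]
```

On this toy input the prefix `(0, 2)` has assumed bound $0.25 + 0.2 = 0.45 > \theta$, so its extension `(0, 1, 2)`-style subtrees are never expanded.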

Figures (8)

  • Figure 1: A Rashomon set of rule sets in the Compas dataset. Each rule set is plotted as a point whose position is determined by the statistical parity (SP) [calders2009building] of the rule set on race and gender (on the $x$- and $y$-axes, respectively). Rule sets are colored by their accuracy scores (ACC). Two example rule sets with similar accuracy but very different statistical parity on race are also shown.
  • Figure 2: Graphical representation of the search carried out by the BB-sts algorithm. The exploration of the search tree starts from the root, indicated by a diamond, which represents the empty rule set $\emptyset$. The sampled partial solutions at depth $i-1$ are highlighted in red. For each sampled partial solution $S$, BB-sts carries out a BB-enum-like search to find all larger partial solutions $\{S'\}_i^S$ at depth $i$ (highlighted in blue) that are subsets of the first $i \cdot \ell$ rules (i.e., they are of level $i \cdot \ell$) and that start with $S$. BB-sts repeats this procedure until, upon reaching the leaves of the search tree, partial solutions of level $M$ are generated.
  • Figure 3: Runtime (in seconds, top row) and estimated $|\mathcal{R}(\mathcal{U})|$ (in log scale, bottom row) versus objective upper bound $\theta$.
  • Figure 4: Runtime (in seconds, top row) and estimated $|\mathcal{R}(\mathcal{U})|$ (in log scale, bottom row) versus number of rules (in log scale).
  • Figure 5: Compas dataset. Estimated feature importance against the ground truth. 95% confidence intervals are shown as black lines.
  • ...and 3 more figures

Theorems & Definitions (28)

  • Theorem 1: Hierarchical objective lower bound
  • Theorem 2: Look-ahead lower bound
  • Theorem 3: Rule set size bound
  • Theorem 4: Lower bound update
  • Theorem 5: Objective update
  • Theorem 6
  • Proposition 1
  • Theorem 7: Extended look-ahead bound
  • proof
  • proof
  • ...and 18 more