Efficient Exploration of the Rashomon Set of Rule Set Models
Martino Ciaperoni, Han Xiao, Aristides Gionis
TL;DR
The paper addresses the problem of exploring the Rashomon set of near-optimal rule-set models, rather than committing to a single best rule set, in order to support interpretability and fairness analysis. It formalizes a binary-rule framework with the objective $f(S;\lambda)=\ell(S)+\lambda|S|$, where $\ell(S)$ is the loss of rule set $S$ and $\lambda$ penalizes its size, and defines the Rashomon set as $\mathcal{R}(\mathcal U,\lambda,\theta)=\{S:\; f(S;\lambda)\le\theta\}$, the rule sets whose objective value is at most the threshold $\theta$. The authors introduce BB-enum, an exact branch-and-bound enumeration algorithm with incremental computation and pruning bounds; Approx-Sample and Approx-Count, approximation methods based on random parity constraints; and BB-sts, a faster method based on a search tree. Experiments on datasets including Compas, Mushrooms, Credit, and Voting show that near-uniform sampling and size estimation of the Rashomon set are feasible at scale, which enables analysis of feature importance and fairness across many near-optimal rule sets. Overall, the work provides a scalable toolkit for exploring interpretable Rashomon sets with or without exhaustive enumeration, with practical implications for understanding task complexity, fairness, and variable importance in deployment contexts.
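To make the definition of $\mathcal{R}(\mathcal U,\lambda,\theta)$ concrete, the following is a minimal brute-force sketch on a toy problem. It is not the paper's BB-enum algorithm: it simply tests every subset of a small candidate pool. The sketch assumes that a rule set predicts the positive class for any sample covered by at least one of its rules and that $\ell(S)$ is the misclassification rate; the names `coverage`, `objective`, and `rashomon_set`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def objective(rule_set, coverage, labels, lam):
    """f(S; lam) = loss(S) + lam * |S| (assumed loss: misclassification rate)."""
    covered = set()
    for rule in rule_set:
        covered |= coverage[rule]  # samples matched by at least one rule
    # Assumption: a covered sample is predicted positive, otherwise negative.
    errors = sum((i in covered) != bool(y) for i, y in enumerate(labels))
    return errors / len(labels) + lam * len(rule_set)


def rashomon_set(coverage, labels, lam, theta):
    """Return every subset S of the candidate rules with f(S; lam) <= theta."""
    rules = sorted(coverage)
    found = []
    for k in range(len(rules) + 1):
        for subset in combinations(rules, k):
            if objective(subset, coverage, labels, lam) <= theta:
                found.append(frozenset(subset))
    return found


# Toy data: six samples, four candidate rules (hypothetical).
labels = [1, 1, 1, 0, 0, 0]
coverage = {
    "r1": {0, 1},  # indices of the samples each rule covers
    "r2": {1, 2},
    "r3": {2, 3},
    "r4": {4},
}

models = rashomon_set(coverage, labels, lam=0.05, theta=0.3)
for m in models:
    print(sorted(m), round(objective(m, coverage, labels, 0.05), 4))
```

On this toy input, four of the sixteen possible rule sets fall within the threshold. Exhaustive testing grows exponentially with the number of candidate rules, which is why pruning and sampling become necessary at realistic scale. One simple observation already hints at pruning: because the loss is nonnegative, no rule set with more than $\theta/\lambda$ rules can belong to the Rashomon set.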
Abstract
Today, as increasingly complex predictive models are developed, simple rule sets remain a crucial tool to obtain interpretable predictions and drive high-stakes decision making. However, a single rule set provides only a partial representation of a learning task. An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance. Existing work on Rashomon-set exploration focuses on exhaustive search of the Rashomon set for particular classes of models, which can be a computationally challenging task. Moreover, exhaustive enumeration produces redundancy that is often unnecessary: for many applications, a representative sample or an estimate of the size of the Rashomon set is sufficient. In this work, we propose, for the first time, efficient methods to explore the Rashomon set of rule set models with or without exhaustive search. Extensive experiments demonstrate the effectiveness of the proposed methods in a variety of scenarios.
