SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration
Elif Arslan, Jacobus G. M. van der Linden, Serge Hoogendoorn, Marco Rinaldi, Emir Demirović
TL;DR
The paper tackles the challenge of leveraging Rashomon sets (collections of near-optimal sparse decision trees) for interpretable, high-stakes modeling. It introduces SORTD, an anytime, best-first framework that enumerates the trees of a Rashomon set in nondecreasing order of the objective, enabling early termination and efficient downstream tasks. Its key innovations are a depth-two subroutine that substantially speeds up computation and a caching-based design that scales to larger feature sets and depths. SORTD supports separable and totally ordered objectives and allows post-hoc evaluation of additional criteria such as fairness. Empirically, SORTD runs up to two orders of magnitude faster and uses much less memory than the state of the art, and it also applies to regression and multi-objective post-evaluation, making Rashomon-set analysis practical for real-world model selection and explanation.
Abstract
Sparse decision tree learning provides accurate and interpretable predictive models that are ideal for high-stakes applications by finding the single most accurate tree within a (soft) size limit. Rather than relying on a single "best" tree, Rashomon sets (sets of trees with similar performance but varying structures) can be used to enhance variable importance analysis, enrich explanations, and enable users to choose simpler trees or those that satisfy stakeholder preferences (e.g., fairness) without hard-coding such criteria into the objective function. However, because finding the optimal tree is NP-hard, enumerating the Rashomon set is inherently challenging. Therefore, we introduce SORTD, a novel framework that improves scalability and enumerates trees in the Rashomon set in order of the objective value, thus offering anytime behavior. Our experiments show that SORTD reduces runtime by up to two orders of magnitude compared with the state of the art. Moreover, SORTD can compute Rashomon sets for any separable and totally ordered objective and supports post-evaluating the set using other separable (and partially ordered) objectives. Together, these advances make exploring Rashomon sets more practical in real-world applications.
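To make the idea of ordered, anytime enumeration under a separable objective concrete, the following toy sketch may help. It is not SORTD itself: it assumes, purely for illustration, that the cost of a tree is the sum of the costs of its left and right subtrees, that each side already has a sorted list of candidate costs, and that the Rashomon bound is multiplicative, (1 + epsilon) times the best cost. The function names `sorted_pair_sums` and `rashomon_prefix` and the parameter `epsilon` are hypothetical and do not come from the paper.

```python
import heapq


def sorted_pair_sums(left, right):
    """Lazily yield (cost, i, j) in nondecreasing order of left[i] + right[j].

    Both `left` and `right` must be sorted in ascending order. Only the
    frontier of candidate pairs is kept in the heap, so the caller can stop
    at any time without materializing every combination.
    """
    if not left or not right:
        return
    heap = [(left[0] + right[0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        cost, i, j = heapq.heappop(heap)
        yield cost, i, j
        # The only pairs that can come next are the two successors of (i, j).
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (left[ni] + right[nj], ni, nj))


def rashomon_prefix(stream, epsilon):
    """Yield items from a sorted stream while cost <= (1 + epsilon) * best.

    Assumption (not from the paper): a multiplicative bound relative to the
    first, and therefore best, cost in the stream.
    """
    best = None
    for cost, i, j in stream:
        if best is None:
            best = cost
        if cost > (1 + epsilon) * best:
            break  # everything after this point is also outside the bound
        yield cost, i, j


# Hypothetical subtree costs, sorted ascending.
left_costs = [1, 3, 6]
right_costs = [2, 4]
all_costs = [c for c, _, _ in sorted_pair_sums(left_costs, right_costs)]
near_optimal = [c for c, _, _ in rashomon_prefix(
    sorted_pair_sums(left_costs, right_costs), epsilon=1.0)]
```

Because the stream is sorted, stopping after any number of items still leaves the caller with the best solutions found so far, which is the sense in which ordered enumeration gives anytime behavior.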
