Probabilistic Truly Unordered Rule Sets
Lincen Yang, Matthijs van Leeuwen
TL;DR
This work addresses interpretable multiclass classification by proposing Probabilistic Truly Unordered Rule Sets (TURS), which avoid implicit rule orders and handle overlaps through probabilistic union predictions. The authors formalize TURS as a probabilistic model, apply the Minimum Description Length (MDL) principle for model selection, and develop a dual-beam, diverse-patience heuristic with MDL-based local testing to learn rule sets efficiently. Empirical results on 31 datasets show that TURS achieves competitive ROC-AUC, produces simpler models, and forms overlaps only between rules with similar probabilistic outputs, which supports the truly unordered premise. The per-rule probability estimates generalize well to unseen data, making TURS a practical framework for interpretable multiclass rule-based learning with principled uncertainty quantification.
Abstract
Rule set learning has recently received renewed attention because of its interpretability. Existing methods have several shortcomings, though. First, most impose an order among rules, either explicitly or implicitly, which makes the models less comprehensible. Second, because conflicts caused by overlaps (i.e., instances covered by multiple rules) are difficult to handle, existing methods often do not consider probabilistic rules. Third, learning classification rules for multi-class targets is understudied, as most existing methods focus on binary classification or reduce multi-class classification to binary problems via the "one-versus-rest" approach. To address these shortcomings, we propose TURS, for Truly Unordered Rule Sets. To resolve conflicts caused by overlapping rules, we propose a novel model that exploits the probabilistic properties of our rule sets, with the intuition of only allowing rules to overlap if they have similar probabilistic outputs. We then formalize the problem of learning a TURS model based on the MDL principle and develop a carefully designed heuristic algorithm. We benchmark against a wide range of rule-based methods and demonstrate that our method learns rule sets that have lower model complexity and highly competitive predictive performance. In addition, we show that rules in our model are empirically "independent" and hence truly unordered.
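To make the overlap handling concrete, the following is a minimal Python sketch of one way a "probabilistic union prediction" could work: an instance covered by several rules receives the class distribution of all training instances covered by the union of those rules, and no rule takes precedence over another. The function name `union_predict`, the representation of rules as boolean predicates, the unsmoothed frequency estimates, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def union_predict(x, rules, X_train, y_train, n_classes):
    """Class probabilities for x, estimated from the union of the training
    instances covered by every rule that covers x (hypothetical helper)."""
    covering = [r for r in rules if r(x)]
    if covering:
        # Pool the training instances covered by any of the covering rules.
        mask = np.zeros(len(X_train), dtype=bool)
        for r in covering:
            mask |= np.array([r(row) for row in X_train])
    else:
        # Assumption: uncovered instances share one default estimate,
        # computed from training instances that no rule covers.
        mask = ~np.array([any(r(row) for r in rules) for row in X_train])
    counts = np.bincount(y_train[mask], minlength=n_classes)
    total = counts.sum()
    # Plain relative frequencies (no smoothing); uniform if nothing matches.
    return counts / total if total else np.full(n_classes, 1.0 / n_classes)

# Toy one-feature data set with two overlapping rules (both hypothetical).
X_train = np.array([[1], [2], [3], [4], [5], [6], [8], [9]], dtype=float)
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1])
rules = [
    lambda row: row[0] <= 4,        # rule 1
    lambda row: 3 <= row[0] <= 6,   # rule 2, overlaps rule 1 on [3, 4]
]

p_overlap = union_predict(np.array([3.5]), rules, X_train, y_train, 2)
p_rule1 = union_predict(np.array([1.0]), rules, X_train, y_train, 2)
p_default = union_predict(np.array([9.0]), rules, X_train, y_train, 2)
```

In this toy example the two rules have similar outputs (class 0 dominates both), so pooling them in the overlap changes the estimate only slightly; that is the situation in which the abstract says overlaps are allowed.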
