Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set

Gabriel Laberge, Yann Pequignot, Alexandre Mathieu, Foutse Khomh, Mario Marchand

TL;DR

The paper tackles the problem of post-hoc explanations under model under-specification by shifting from single-model attributions to statements that hold across the Rashomon Set of all good models, yielding partial orders over local and global feature importance. It formalizes a consensus-based framework that uses optimization over ellipsoids and combinatorial relaxations to derive trustworthy local and global attribution relations, while introducing error-tolerance controls (capture bounds and heuristics) and sensitivity analyses. The authors instantiate the framework across Additive Regression, Kernel Ridge, and Random Forests, deriving practical procedures for local and global consensus and demonstrating, on real datasets like Kaggle-Houses and Adult-Income, that partial orders provide robust, cautious interpretations even when under-specification is high. The work highlights that consensus-based partial orders can preserve informative explanations while avoiding overconfident or conflicting claims, with broad implications for interpretability in ML deployments. Overall, the approach offers a principled path to reliable explanations that respect uncertainty in model choice and data noise, potentially guiding safer decision-making in high-stakes domains.

Abstract

Post-hoc global/local feature attribution methods are progressively being employed to understand the decisions of complex machine learning models. Yet, because of limited amounts of data, it is possible to obtain a diversity of models with good empirical performance but that provide very different explanations for the same prediction, making it hard to derive insight from them. In this work, instead of aiming at reducing the under-specification of model explanations, we fully embrace it and extract logical statements about feature attributions that are consistent across all models with good empirical performance (i.e. all models in the Rashomon Set). We show that partial orders of local/global feature importance arise from this methodology enabling more nuanced interpretations by allowing pairs of features to be incomparable when there is no consensus on their relative importance. We prove that every relation among features present in these partial orders also holds in the rankings provided by existing approaches. Finally, we present three use cases employing hypothesis spaces with tractable Rashomon Sets (Additive models, Kernel Ridge, and Random Forests) and show that partial orders allow one to extract consistent local and global interpretations of models despite their under-specification.
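The consensus idea can be illustrated with a minimal sketch. It assumes a small finite sample of good models, whereas the paper reasons over the whole Rashomon Set, and it compares attribution magnitudes only, which simplifies the paper's definitions. The function name `consensus_order`, the feature names, and the attribution values are all hypothetical.

```python
from itertools import permutations


def consensus_order(attributions):
    """Return pairs (i, j) where every model ranks feature i above feature j.

    attributions: list of dicts, one per model, mapping feature -> attribution.
    Importance is taken as the absolute attribution (a simplifying assumption).
    """
    features = list(attributions[0])
    return {
        (i, j)
        for i, j in permutations(features, 2)
        if all(abs(phi[i]) > abs(phi[j]) for phi in attributions)
    }


# Hypothetical local attributions from three equally good models.
models = [
    {"area": 0.9, "age": -0.4, "garage": 0.3},
    {"area": 0.8, "age": -0.2, "garage": 0.35},
    {"area": 1.1, "age": -0.5, "garage": 0.1},
]

order = consensus_order(models)
# "area" dominates both other features in every model, while "age" and
# "garage" swap places across models and are therefore left incomparable.
print(sorted(order))
```

Because a pair is kept only when every model agrees, the result is a partial order rather than a total ranking: features on which the models disagree simply have no edge between them.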

Paper Structure

This paper contains 61 sections, 9 theorems, 81 equations, 17 figures, 2 tables.

Key Result

Proposition 8

Under the assumption that the data were generated by the optimal model $h^\star$ plus i.i.d. zero-mean Gaussian noise, and using the squared loss $\ell(y',y)=(y' - y)^2$, we have that [...], where $F_{\chi^2_N}$ is the CDF of a chi-squared random variable with $N$ degrees of freedom. The proof is provided in the appendix (app:proofs:statistical).
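The following is a hedged sketch of how a chi-squared CDF arises under the stated assumptions, not the proposition's exact statement. It assumes the Rashomon Set contains the models whose empirical loss $\hat{L}_S$ over $N$ training samples is at most $\epsilon$, and that the noise has variance $\sigma^2$. The symbols $\hat{L}_S$, $\eta_i$, and $\sigma^2$ are introduced here for illustration.

$$
\begin{aligned}
y_i &= h^\star(x_i) + \eta_i, \qquad \eta_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2), \\
\hat{L}_S(h^\star) &= \frac{1}{N}\sum_{i=1}^{N} \bigl(h^\star(x_i) - y_i\bigr)^2 = \frac{1}{N}\sum_{i=1}^{N} \eta_i^2, \\
\frac{N\,\hat{L}_S(h^\star)}{\sigma^2} &= \sum_{i=1}^{N} \Bigl(\frac{\eta_i}{\sigma}\Bigr)^2 \sim \chi^2_N, \\
\mathbb{P}\bigl[\hat{L}_S(h^\star) \le \epsilon\bigr] &= F_{\chi^2_N}\!\Bigl(\frac{N\epsilon}{\sigma^2}\Bigr).
\end{aligned}
$$

Under these assumptions, the last line gives the probability that the optimal model falls within the loss tolerance $\epsilon$, which is the kind of capture bound the TL;DR mentions.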

Figures (17)

  • Figure 1: Left: local feature attributions for the average model $h_E$ (orange line) and each individual model (blue lines). Right: Partial order of local feature importance. There is a directed path from feature $x_i$ to feature $x_j$ if all good models agree that feature $x_i$ is more important than $x_j$.
  • Figure 2: Residuals Analysis of $h_S$. (Left) Residual as a function of the prediction to assess homogeneity. The horizontal lines represent the $25^\text{th}$, $50^\text{th}$, and $75^\text{th}$ percentiles for three different prediction bins. (Right) Histogram of the residuals and fitted densities.
  • Figure 3: (Left) Sensitivity Analysis regarding the choice of $\epsilon$. The median partial-order cardinalities are shown as a function of the tolerance on training RMSE. The two curves represent whether or not we group correlated features together. (Right) Local Feature Attributions of models sampled from the Rashomon Set boundary. We observe a trade-off between local attributions of correlated features.
  • Figure 4: Local feature attributions of a house with a below-average price. (Top) Without grouping. (Bottom) With grouping.
  • Figure 5: Global Feature Importance of the Kaggle-Houses dataset. (Top) Without grouping. (Bottom) With grouping.
  • ...and 12 more figures

Theorems & Definitions (19)

  • Definition 1: Rashomon Set
  • Definition 2: Positive (Negative) Gap
  • Definition 3: Positive (Negative) Attribution
  • Definition 4: Local Relative Importance
  • Definition 5: Local Feature Attribution Consensus
  • Definition 6: Global Relative Importance
  • Definition 7: Global Feature Importance Consensus
  • Proposition 8
  • Proposition 9
  • Proposition 10
  • ...and 9 more