Many Ways to be Right: Rashomon Sets for Concept-Based Neural Networks

Shihan Feng, Cheng Zhang, Michael Xi, Ethan Hsu, Lesia Semenova, Chudi Zhong

TL;DR

This work addresses the Rashomon effect in deep learning by proposing Rashomon Concept Bottleneck Models (Rashomon CBMs), which construct a diverse slice of near-optimal concept-based networks. The method freezes a shared backbone and injects per-model adapters to induce distinct concept representations, training all models jointly with a diversity-regularized objective and a memory-efficient model-axis checkpointing scheme. Empirical results across AwA2, CUB, CIFAR-10, and CelebA demonstrate competitive accuracy while revealing rich diversity in concept usage and decision pathways, with adapters primarily driving variation in deeper layers. The framework advances interpretability and auditing by enabling systematic inspection of multiple valid reasoning strategies for the same predictions, with practical implications for alignment and safety in AI systems.

Abstract

Modern neural networks rarely have a single way to be right. For many tasks, multiple models can achieve identical performance while relying on different features or reasoning patterns, a property known as the Rashomon Effect. However, uncovering this diversity in deep architectures is challenging as their continuous parameter spaces contain countless near-optimal solutions that are numerically distinct but often behaviorally similar. We introduce Rashomon Concept Bottleneck Models, a framework that learns multiple neural networks which are all accurate yet reason through distinct human-understandable concepts. By combining lightweight adapter modules with a diversity-regularized training objective, our method constructs a diverse set of deep concept-based models efficiently without retraining from scratch. The resulting networks provide fundamentally different reasoning processes for the same predictions, revealing how concept reliance and decision making vary across equally performing solutions. Our framework enables systematic exploration of data-driven reasoning diversity in deep models, offering a new mechanism for auditing, comparison, and alignment across equally accurate solutions.
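
The combination described above (a frozen shared backbone, per-model adapters, and a diversity term) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the adapter form (a LoRA-style low-rank update), the single linear "backbone", the cosine-similarity penalty, and all names and dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature dim, adapter rank, models, concepts, classes.
D, R, M, C, K = 16, 2, 3, 5, 4

# Frozen weight shared by every model (stand-in for the backbone).
W_frozen = rng.normal(size=(D, D))

# Per-model low-rank adapters (assumed form: W_frozen + B[m] @ A[m]).
A = rng.normal(scale=0.1, size=(M, R, D))
B = rng.normal(scale=0.1, size=(M, D, R))

# Per-model concept layer and linear classifier on top of the concepts.
W_concept = rng.normal(size=(M, D, C))
W_cls = rng.normal(size=(M, C, K))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, m):
    """Run model m: shared frozen weight plus that model's own adapter."""
    h = x @ (W_frozen + B[m] @ A[m]).T      # (N, D)
    concepts = sigmoid(h @ W_concept[m])    # (N, C) concept probabilities
    logits = concepts @ W_cls[m]            # (N, K) class scores
    return concepts, logits


def diversity_penalty(concept_batches):
    """Mean pairwise cosine similarity of the models' concept activations.

    Assumed regularizer: adding this to the task loss would push models
    toward different concept representations. Lower means more diverse.
    """
    flat = np.stack([c.ravel() for c in concept_batches])
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T
    upper = np.triu_indices(len(flat), k=1)
    return sim[upper].mean()


x = rng.normal(size=(8, D))
all_concepts = [forward(x, m)[0] for m in range(M)]
penalty = diversity_penalty(all_concepts)
```

In this sketch only `A`, `B`, `W_concept`, and `W_cls` would receive gradients during training, while `W_frozen` stays fixed, so each additional model costs only its adapter and head parameters.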

Paper Structure

This paper contains 30 sections, 4 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: Proposed architecture for Rashomon CBMs. (a) Overall structure: an input image passes through a frozen backbone with attached adapters, followed by multiple parallel concept layers, each linked to a classifier. (b) Zoom-in on a backbone block: at each Attention layer, multiple adapter modules are inserted in parallel into the Q, K, V, and projection mappings, operating concurrently and independently. All adapters are trained jointly; at inference, each model activates only its own adapters and shares the same backbone weights.
  • Figure 2: Effect of varying the number of models ($M$) from 10 to 25 on accuracy and diversity on CIFAR-10. As $M$ increases, concept similarity and CKA rise while concept accuracy decreases slightly, indicating mild convergence in the representation space.
  • Figure 3: Layerwise eigenvector similarity on QKV matrices (left) and projection matrices (right) between models sharing the same ViT backbone on AwA2. Similarity gradually declines with depth, and diversity concentrates in the adapters of the final few blocks.
  • Figure 4: Layerwise ablation analysis on CIFAR-10 with a ViT backbone. Adapters in each layer are switched from shared to independent one at a time. Task accuracy remains stable, while similarity metrics decrease as deeper layers become independent.
  • Figure 5: Qualitative analysis of Rashomon CBMs on a tiger image from the AwA2 dataset. The left heatmap shows sparse SHAP importance for 5 models, indicating that each model relies on different concepts for prediction. The middle heatmap shows the predicted concept probabilities, which are largely consistent across models. The right heatmap visualizes the linear classifier weights assigned to each concept, suggesting that the models differ in how they map concepts to final predictions.
  • ...and 9 more figures
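
Several captions above report CKA between models. As a reference point, the linear variant of centered kernel alignment can be computed as in the sketch below. Whether the paper uses the linear or a kernel variant is not stated in this listing, so the choice of linear CKA is an assumption, and the function name and array shapes are illustrative.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between two activation matrices with matching rows.

    X has shape (n_samples, d1) and Y has shape (n_samples, d2).
    Returns a value in [0, 1]; 1 means the representations match up to
    an orthogonal transform and isotropic scaling.
    """
    # Center each feature so the comparison ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)


rng = np.random.default_rng(0)
feats_a = rng.normal(size=(64, 10))   # stand-in for one model's features
feats_b = rng.normal(size=(64, 12))   # stand-in for another model's features
score = linear_cka(feats_a, feats_b)
```

A higher score between two models' concept-layer activations would indicate more similar representations, which is how the rising CKA in Figure 2 reads as mild convergence.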