Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
Weiqiu You, Helen Qu, Marco Gatti, Bhuvnesh Jain, Eric Wong
TL;DR
Sum-of-Parts (SOP) is a model-agnostic framework that converts any differentiable model into a group-based self-attributing neural network (SANN) by learning feature groups end-to-end, without group supervision. The authors prove that per-feature SANNs incur unavoidable errors on correlated features, and that these errors grow with the input dimension, whereas group-based SANNs can achieve zero error when the groups align with the correlations in the data. SOP combines a learnable Group Generator, a Backbone Predictor, and a SparseCrossAttn-based Group Selector to produce sparse, interpretable, and faithful attributions. It achieves state-of-the-art performance among SANNs on vision and language tasks and is shown to be useful for model debugging and cosmological discovery. The framework is validated on ImageNet-S, CosmoGrid, and MultiRC, with detailed analyses of interpretability and semantic coherence and with domain-specific insights, including new cosmological findings about voids and clusters. Overall, SOP provides a scalable path to faithful explanations that preserve predictive performance and support real-world scientific and diagnostic applications.
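The three components named above can be pictured with a small toy forward pass. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `sparsemax`, `backbone`, and `sop_forward` are hypothetical, the group masks and selector scores are passed in as fixed inputs rather than learned, and sparsemax stands in for whatever sparse normalization the SparseCrossAttn module actually uses. The sketch only illustrates the idea that the final prediction is a sparse weighted sum of backbone predictions on masked inputs, so each group's contribution can be read off directly.

```python
# Toy sketch of a group-based self-attributing forward pass.
# All names are hypothetical; masks and scores would be learned in a real model.

def sparsemax(scores):
    """Project scores onto the probability simplex; some outputs are exactly 0."""
    z = sorted(scores, reverse=True)
    cumsum, tau = 0.0, 0.0
    for j, zj in enumerate(z, start=1):
        cumsum += zj
        if 1.0 + j * zj > cumsum:  # index j is still inside the support
            tau = (cumsum - 1.0) / j
    return [max(s - tau, 0.0) for s in scores]

def backbone(x):
    """Stand-in predictor: two class logits from a 4-feature input."""
    w = [[1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def sop_forward(x, masks, selector_scores):
    """Weighted sum of backbone outputs computed on masked copies of x."""
    weights = sparsemax(selector_scores)  # sparse group weights (assumed form)
    group_logits = [backbone([m * xi for m, xi in zip(mask, x)]) for mask in masks]
    n_classes = len(group_logits[0])
    logits = [sum(c * g[k] for c, g in zip(weights, group_logits))
              for k in range(n_classes)]
    return logits, weights, group_logits

x = [1.0, 2.0, 3.0, 4.0]
masks = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1]]  # three candidate feature groups
scores = [0.5, 0.4, -1.0]                           # assumed selector scores
logits, weights, group_logits = sop_forward(x, masks, scores)
```

In this toy run the third group receives a weight of exactly zero, so the prediction is attributed to the first two groups only.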
Abstract
Self-attributing neural networks (SANNs) present a potential path towards interpretable models for high-dimensional problems, but often face significant trade-offs in performance. In this work, we formally prove a lower bound on the errors of per-feature SANNs and show that group-based SANNs, by contrast, can achieve zero error and thus high performance. Motivated by these insights, we propose Sum-of-Parts (SOP), a framework that transforms any differentiable model into a group-based SANN, where feature groups are learned end-to-end without group supervision. SOP achieves state-of-the-art performance for SANNs on vision and language tasks, and we validate that the groups are interpretable on a range of quantitative and semantic metrics. We further validate the utility of SOP explanations in model debugging and cosmological scientific discovery. Our code is available at https://github.com/BrachioLab/sop
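The contrast between per-feature and group-based attributions can be illustrated with a two-feature toy case. The sketch below is an assumption-laden simplification, not the paper's theorem or its formal error definition: it treats a per-feature model as an additive fit a(x1) + b(x2) and uses XOR as a stand-in for strongly interacting features. Any additive function has zero interaction contrast, so the best additive fit to XOR leaves a fixed residual, while a single group containing both features reproduces the function exactly.

```python
# Toy illustration only: XOR of two binary features as an interacting pair.
def f(x1, x2):
    return x1 ^ x2

def interaction(fn):
    # This contrast is 0 for every additive function a(x1) + b(x2).
    return fn(0, 0) - fn(0, 1) - fn(1, 0) + fn(1, 1)

def sse(fn, gn):
    # Sum of squared errors over the four binary inputs.
    return sum((fn(a, b) - gn(a, b)) ** 2 for a in (0, 1) for b in (0, 1))

# The least-squares additive fit to XOR is the constant 0.5.
def best_additive(x1, x2):
    return 0.5

per_feature_error = sse(f, best_additive)

# Smallest squared error any additive fit can reach on {0,1}^2.
lower_bound = interaction(f) ** 2 / 4

# One group holding both features: the mask keeps the pair together,
# so the prediction on the masked input reproduces f exactly.
mask = (1, 1)

def group_model(x1, x2):
    return f(mask[0] * x1, mask[1] * x2)

group_error = sse(f, group_model)
```

Here the additive fit attains the lower bound with a nonzero error, and the grouped model has zero error, which mirrors the qualitative claim in the abstract on a deliberately small example.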
