OMENN: One Matrix to Explain Neural Networks
Adam Wróbel, Mikołaj Janusz, Bartosz Zieliński, Dawid Rymarczyk
TL;DR
OMENN tackles the opacity of deep networks by reformulating a trained model for a given input as a single input-dependent affine transformation, yielding an explanation matrix $\mathbf{C} = \mathbf{C_w} \odot \mathbf{X} + \mathbf{C_b}$ whose sum equals the model output. Grounded in the dynamic linearity property, the method derives $\mathbf{C_w}$ and $\mathbf{C_b}$ by fusing layer weights and biases and augmenting the input with a ones channel, enabling exact, locally faithful attributions for modern architectures including ViTs and CNNs. The authors provide theoretical justification and validate OMENN on the FunnyBirds benchmark and the Quantus faithfulness metric, showing state-of-the-art or competitive performance across backbones and tasks, with qualitative analyses illustrating pixel-level contributions aligned with class-specific regions. The work has practical implications for explanations, potential benefits for knowledge distillation and continual learning, and highlights remaining limitations related to non-affine operations and user-centric visualization aspects.
Abstract
Deep Learning (DL) models are often black boxes, making their decision-making processes difficult to interpret. This lack of transparency has driven advancements in eXplainable Artificial Intelligence (XAI), a field dedicated to clarifying the reasoning behind DL model predictions. Among these, attribution-based methods such as LRP and GradCAM are widely used, though they rely on approximations that can be imprecise. To address these limitations, we introduce One Matrix to Explain Neural Networks (OMENN), a novel post-hoc method that represents a neural network as a single, interpretable matrix for each specific input. This matrix is constructed through a series of linear transformations that represent the processing of the input by each successive layer in the neural network. As a result, OMENN provides locally precise, attribution-based explanations of the input across various modern models, including ViTs and CNNs. We present a theoretical analysis of OMENN based on dynamic linearity property and validate its effectiveness with extensive tests on two XAI benchmarks, demonstrating that OMENN is competitive with state-of-the-art methods.
