MambaLRP: Explaining Selective State Space Sequence Models

Farnoush Rezaei Jafari; Grégoire Montavon; Klaus-Robert Müller; Oliver Eberle

TL;DR

MambaLRP introduces a conservation-based Layer-wise Relevance Propagation framework tailored to Mamba selective state-space models, addressing non-conservative layers (SiLU, SSM, multiplicative gates) with principled propagation rules. The method yields faithful, efficient explanations that outperform gradient- and attention-based baselines across NLP and vision tasks, while enabling bias analysis and evaluation of long-range dependencies. By validating conservation and demonstrating practical use cases, the work enhances trust and interpretability for linear-time sequence models. The approach offers a scalable toolset for debugging, fairness assessment, and deeper insight into the mechanisms of Mamba architectures and related structured state-space models.
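As a rough illustration of what "non-conservative" means for a SiLU layer, the sketch below compares the total relevance entering the layer with the total leaving it under plain Gradient × Input. All names, input values, and upstream gradients are hypothetical. The "gate held constant" variant is an assumption chosen for illustration (it treats the sigmoid factor as a fixed coefficient in the backward pass) and is not confirmed by this summary to be the propagation rule the paper uses.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def silu(x):
    # SiLU activation: y = x * sigmoid(x)
    return x * sigmoid(x)

def silu_grad(x):
    # Exact derivative: dy/dx = sigmoid(x) + x * sigmoid'(x)
    s = sigmoid(x)
    return s + x * s * (1.0 - s)

def relevance_gradient_x_input(xs, upstream):
    # Gradient x Input at the layer input: R(x_i) = x_i * dy_i/dx_i * df/dy_i
    return [x * silu_grad(x) * g for x, g in zip(xs, upstream)]

def relevance_gate_constant(xs, upstream):
    # Assumed variant: sigmoid(x) is treated as a constant, so dy/dx := sigmoid(x)
    return [x * sigmoid(x) * g for x, g in zip(xs, upstream)]

# Hypothetical layer inputs and hypothetical gradients df/dy arriving from above.
xs = [-2.0, -0.5, 0.3, 1.5]
upstream = [0.7, -1.2, 0.4, 1.0]

# Relevance at the layer output: R(y_i) = y_i * df/dy_i
total_out = sum(silu(x) * g for x, g in zip(xs, upstream))
total_gi = sum(relevance_gradient_x_input(xs, upstream))
total_const = sum(relevance_gate_constant(xs, upstream))

print("relevance at output:          ", total_out)
print("Gradient x Input at input:    ", total_gi)
print("gate held constant, at input: ", total_const)
```

Under these assumptions the Gradient × Input total differs from the output total, while the gate-constant variant reproduces it exactly, because x · sigmoid(x) is the layer output itself.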

Abstract

Recent sequence modeling approaches using selective state space sequence models, referred to as Mamba models, have seen a surge of interest. These models allow efficient processing of long sequences in linear time and are rapidly being adopted in a wide range of applications such as language modeling, demonstrating promising performance. To foster their reliable use in real-world scenarios, it is crucial to augment their transparency. Our work bridges this critical gap by bringing explainability, particularly Layer-wise Relevance Propagation (LRP), to the Mamba architecture. Guided by the axiom of relevance conservation, we identify specific components in the Mamba architecture that cause unfaithful explanations. To remedy this issue, we propose MambaLRP, a novel algorithm within the LRP framework, which ensures a more stable and reliable relevance propagation through these components. Our proposed method is theoretically sound and achieves state-of-the-art explanation performance across a diverse range of models and datasets. Moreover, MambaLRP facilitates a deeper inspection of Mamba architectures, uncovering various biases and evaluating their significance. It also enables the analysis of previous speculations regarding the long-range capabilities of Mamba models.

Paper Structure (46 sections, 3 theorems, 15 equations, 13 figures, 11 tables, 2 algorithms)

Introduction
Related Work
Background
Selective SSMs (S6)
Layer-wise Relevance Propagation
LRP for Mamba
Relevance propagation in SiLU layers
Relevance propagation in selective SSMs (S6)
Relevance propagation in multiplicative gates
Additional modifications and summary
Experiments
Datasets
Baseline methods
Conservation property
Qualitative evaluation
...and 31 more sections

Key Result

Proposition 4.1

Applying the standard gradient propagation equations yields the following result, which relates the relevance values before and after the activation layer:
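The displayed equation of the proposition is not reproduced on this page. As a sketch only, assuming the activation in question is SiLU, $y = x \odot \sigma(x)$ with $\sigma$ the logistic sigmoid, and that relevance is defined as Gradient$\,\times\,$Input, i.e. $\mathcal{R}(x) = x \odot \partial f/\partial x$ and $\mathcal{R}(y) = y \odot \partial f/\partial y$ (the symbols $f$, $\mathcal{R}$, and $\odot$ are notation assumed here), the chain rule gives:

```latex
\begin{align*}
\mathcal{R}(x)
  &= x \odot \frac{\partial f}{\partial x}
   = x \odot \bigl(\sigma(x) + x \odot \sigma'(x)\bigr) \odot \frac{\partial f}{\partial y} \\
  &= \underbrace{x \odot \sigma(x) \odot \frac{\partial f}{\partial y}}_{\mathcal{R}(y)}
   \;+\; x \odot x \odot \sigma'(x) \odot \frac{\partial f}{\partial y}
\end{align*}
```

Under these assumptions the second term is a residual that is generally nonzero, so the relevance before and after the activation differs and conservation does not hold at this layer. The exact form and notation of Proposition 4.1 in the paper may differ.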

Figures (13)

Figure 1: Conceptual steps involved in the design of MambaLRP. (a) Take as a starting point a basic LRP procedure, equivalent to Gradient$\,\times\,$Input. (b) Analyze layers in which the conservation property is violated. (c) Rework the relevance propagation strategy at those layers to achieve conservation. The resulting MambaLRP method enables efficient and faithful explanations.
Figure 2: Unfolded view of SSM, highlighting two subsets of nodes, the relevance of which should be conserved throughout relevance propagation.
Figure 3: Conservation property. The x-axis represents the sum of relevance scores across the input features and the y-axis shows the network's output score. Each point corresponds to one example and its proximity to the blue identity line indicates the extent to which conservation is preserved, with closer alignment suggesting improved conservation.
Figure 4: Explanations generated for a sentence of the SST-2 dataset. Shades of red represent words that positively influence the model's prediction. Conversely, shades of blue reflect negative contributions. The heatmaps of attention-based methods are constrained to non-negative values.
Figure 5: Explanations produced by different explanation methods for images of the ImageNet dataset. AttnRoll and MambaAttr are limited to non-negative heatmap values.
...and 8 more figures

Theorems & Definitions (3)

Proposition 4.1
Proposition 4.2
Proposition 4.3
