Graph Signal Processing Meets Mamba2: Adaptive Filter Bank via Delta Modulation

Yehjin Shin; Seojin Kim; Noseong Park

Abstract

State-space models (SSMs) offer efficient alternatives to attention with linear-time recurrence. Mamba2, a recent SSM-based language model, uses selective input gating and a multi-head structure, enabling parallel computation and strong benchmark performance. However, its multi-head recurrence operates independently without structured utilization or analysis. In this work, we propose a novel method called Hierarchical ADaptive filter bank for Efficient SSMs (HADES), a Graph Signal Processing (GSP)-inspired framework that reinterprets Mamba2 as an adaptive filter bank on a line graph. Our hierarchical architecture introduces two filter types: shared filters for global low-pass behavior and expert filters for local high-pass behavior, achieved through structured bias on the parameter Δ. HADES achieves comparable performance to baseline models including Mamba2 across various benchmarks in language modeling, commonsense reasoning, and long-context retrieval, while using only 58.9% of the original parameters. In this regard, HADES bridges GSP and neural sequence modeling, enabling efficient, hierarchical, and interpretable filtering within state-space models.
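As a rough illustration of why a bias on Δ can steer a head toward smoother or sharper filtering, the sketch below computes the frequency response of a scalar first-order recurrence whose decay is set by Δ. It is not the HADES implementation: the scalar continuous-time parameter `a_cont`, the unit-DC-gain input scaling `(1 - a)`, and all function names are assumptions made for this example only.

```python
import numpy as np

def decay_from_delta(delta, a_cont=-1.0):
    """Discretized decay a = exp(delta * A) for a scalar A < 0 (assumed)."""
    return np.exp(delta * a_cont)

def magnitude_response(delta, omega, a_cont=-1.0):
    """|H(omega)| for h_t = a * h_{t-1} + (1 - a) * x_t.

    The transfer function is H(omega) = (1 - a) / (1 - a * exp(-i * omega)).
    The (1 - a) input scaling is an assumption chosen so that the gain at
    omega = 0 is exactly 1; it is not taken from the paper.
    """
    a = decay_from_delta(delta, a_cont)
    return np.abs((1.0 - a) / (1.0 - a * np.exp(-1j * omega)))

def nyquist_gain(delta, a_cont=-1.0):
    """Gain at the highest discrete frequency, omega = pi."""
    return magnitude_response(delta, np.pi, a_cont)

if __name__ == "__main__":
    for delta in (0.1, 1.0, 3.0):
        print(f"delta={delta:>4}: DC gain={magnitude_response(delta, 0.0):.3f}, "
              f"Nyquist gain={nyquist_gain(delta):.3f}")
```

Under these assumptions, a small Δ keeps the decay close to 1, so the recurrence averages over a long history and strongly attenuates high frequencies. A large Δ pushes the decay toward 0, giving a nearly flat response that preserves high-frequency content. This shows only the direction of the effect; it is not a true high-pass filter.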

Paper Structure (77 sections, 31 equations, 12 figures, 15 tables)


Introduction
Background
State Space Models (SSMs) and Mamba
Graph Signal Processing (GSP)
Graph Signals and Filtering.
Graph Filter Banks.
SSMs in the perspective of GSP
SSMs as Graph Filters.
Multi-Head SSMs as Filter Banks.
Proposed Method
HADES: Hierarchical Adaptive Filter Bank for Efficient SSMs
Expert Filters.
Shared Filters.
Training Loss Terms
Load Balance Loss.
...and 62 more sections

Figures (12)

Figure 1: Distribution of layer-wise Effective Rank from the spectral responses of Mamba2 and HADES
Figure 2: Architectural Comparison between Mamba2 and HADES. Mamba2 applies all filters uniformly to every input token, whereas HADES employs a routing mechanism that selects and activates filters conditioned on the spectral residual $r_t$ and $\Delta_t$.
Figure 3: Passkey retrieval result of Mamba2 and HADES
Figure 4: Expert filter selection in Passkey Retrieval task
Figure 5: Spectrum of filter inputs and outputs from Mamba2 and HADES. The x-axis represents the Fourier frequency bins, and the y-axis shows the normalized magnitude of the Fourier coefficients, with larger values indicating stronger frequency components (see Appendix \ref{app:spectrum} for details).
...and 7 more figures
