SpectralMamba-UNet: Frequency-Disentangled State Space Modeling for Texture-Structure Consistent Medical Image Segmentation

Fuhao Zhang; Lei Liu; Jialin Zhang; Ya-Nan Zhang; Nan Mu

TL;DR

This work proposes SpectralMamba-UNet, a frequency-disentangled framework that decouples the learning of structural and textural information in the spectral domain. It introduces a Spectral Channel Reweighting mechanism for channel-wise frequency-aware attention and a Spectral-Guided Fusion module for adaptive multi-scale fusion in the decoder.

Abstract

Accurate medical image segmentation requires effective modeling of both global anatomical structures and fine-grained boundary details. Recent state space models (e.g., Vision Mamba) offer efficient long-range dependency modeling. However, their one-dimensional serialization weakens local spatial continuity and high-frequency representation. To this end, we propose SpectralMamba-UNet, a novel frequency-disentangled framework that decouples the learning of structural and textural information in the spectral domain. Our Spectral Decomposition and Modeling (SDM) module applies a discrete cosine transform to decompose features into low- and high-frequency components: the low-frequency components support global contextual modeling via a frequency-domain Mamba, while the high-frequency components preserve boundary-sensitive details. To balance spectral contributions, we introduce a Spectral Channel Reweighting (SCR) mechanism that forms channel-wise frequency-aware attention, and a Spectral-Guided Fusion (SGF) module that achieves adaptive multi-scale fusion in the decoder. Experiments on five public benchmarks demonstrate consistent improvements across diverse modalities and segmentation targets, validating the effectiveness and generalizability of our approach.
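The frequency split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's SDM implementation: the function names, the `cutoff` parameter, and the rule that a coefficient counts as "low frequency" when its row and column indices sum to less than `cutoff` are all assumptions made for illustration, and the frequency-domain Mamba branch is omitted entirely. The sketch only shows how a 2D DCT can separate a feature map into a smooth (structural) part and a residual (textural) part that sum back to the input.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of shape (n, n)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale
    return m

def split_frequencies(x, cutoff):
    """Split a 2D map into low- and high-frequency parts via the 2D DCT.

    `cutoff` is a hypothetical hyperparameter: coefficient (u, v) is
    treated as low frequency when u + v < cutoff.
    """
    h, w = x.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeffs = dh @ x @ dw.T                      # forward 2D DCT
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    low_mask = (u + v) < cutoff
    low = dh.T @ (coeffs * low_mask) @ dw       # inverse DCT of low band
    high = dh.T @ (coeffs * ~low_mask) @ dw     # inverse DCT of high band
    return low, high

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.standard_normal((8, 8))
    low, high = split_frequencies(feature, cutoff=3)
    print(np.allclose(low + high, feature))
```

Because the DCT basis used here is orthonormal and the two masks are complementary, the low and high parts always sum back to the input, so any downstream processing of one branch leaves the other untouched.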

Paper Structure (12 sections, 5 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Methodology
    Motivation and Overview
    Spectral Decomposition and Modeling
    Spectral Channel Reweighting
    Spectral-Guided Fusion
    Overall Forward Process
Experiments
    Datasets and Implementation
    Comparison with State-of-the-Art
    Ablation Studies
Conclusion

Figures (3)

Figure 1: Architecture of SpectralMamba-UNet. SDM performs spectral decomposition, SCR reweights frequency responses, and SGF enables frequency-guided decoder fusion.
Figure 2: Qualitative comparison on Synapse, ACDC, EAT, IA, and DRIVE (left to right). Compared with representative baselines (c–g), SpectralMamba-UNet (h) produces sharper boundaries and improved topological consistency.
Figure 3: Qualitative comparison of ablation variants. From left to right: input image, ground truth, Baseline, +Freq, +Spatial Mamba, +Freq+SCR+SGF, +SDM, and the complete SpectralMamba-UNet. The full model produces clearer boundaries and improved structural continuity across datasets.
