MasHeNe: A Benchmark for Head and Neck CT Mass Segmentation using Window-Enhanced Mamba with Frequency-Domain Integration

Thao Thi Phuong Dao; Tan-Cong Nguyen; Nguyen Chi Thanh; Truong Hoang Viet; Trong-Le Do; Mai-Khiem Tran; Minh-Khoi Pham; Trung-Nghia Le; Minh-Triet Tran; Thanh Dinh Le

TL;DR

MasHeNe addresses the lack of public datasets for head-and-neck mass segmentation by introducing a contrast-enhanced CT (CE-CT) dataset that covers both tumors and cysts, and by establishing a standard benchmark on it. The paper also proposes WEMF, a Windowing-Enhanced Mamba model with Frequency integration that combines tri-window CT inputs with frequency-domain attention on the skip connections, and that achieves the best segmentation metrics among the methods evaluated on MasHeNe. Quantitatively, WEMF outperforms CNN, Transformer, and other Mamba baselines in Dice, IoU, NSD, and boundary accuracy (HD95) while maintaining reasonable efficiency. The authors discuss limitations and outline future work on more lesion types, multi-center data, and boundary-focused improvements to advance reproducible research in this domain.

Abstract

Head and neck masses are space-occupying lesions that can compress the airway and esophagus and may affect nerves and blood vessels. Available public datasets primarily focus on malignant lesions and often overlook other space-occupying conditions in this region. To address this gap, we introduce MasHeNe, an initial dataset of 3,779 contrast-enhanced CT slices that includes both tumors and cysts with pixel-level annotations. We also establish a benchmark using standard segmentation baselines and report common metrics to enable fair comparison. In addition, we propose the Windowing-Enhanced Mamba with Frequency integration (WEMF) model. WEMF applies tri-window enhancement to enrich the input appearance before feature extraction. It further uses multi-frequency attention to fuse information across skip connections within a U-shaped Mamba backbone. On MasHeNe, WEMF attains the best performance among evaluated methods, with a Dice of 70.45%, IoU of 66.89%, NSD of 72.33%, and HD95 of 5.12 mm. These results indicate that the model performs strongly and stably on this challenging task. MasHeNe provides a benchmark for head-and-neck mass segmentation beyond malignancy-only datasets. The observed error patterns also suggest that the task remains difficult and requires further research. Our dataset and code are available at https://github.com/drthaodao3101/MasHeNe.git.
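The tri-window enhancement mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common CT practice of clipping Hounsfield units (HU) to a window defined by a center and a width, rescaling to [0, 1], and stacking three such views as input channels. The window presets and the names `ASSUMED_WINDOWS`, `apply_window`, and `tri_window` are hypothetical and chosen only for illustration; the abstract does not state which windows WEMF uses or how they are combined.

```python
import numpy as np

# Hypothetical (center, width) presets in Hounsfield units.
# The windows actually used by WEMF are not given in the abstract.
ASSUMED_WINDOWS = ((40.0, 400.0), (60.0, 150.0), (400.0, 1800.0))


def apply_window(hu, center, width):
    """Clip an HU array to [center - width/2, center + width/2] and scale to [0, 1]."""
    lo = center - width / 2.0
    hi = center + width / 2.0
    clipped = np.clip(hu.astype(np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)


def tri_window(hu, windows=ASSUMED_WINDOWS):
    """Stack three windowed views of one CT slice as channels: (3, H, W)."""
    return np.stack([apply_window(hu, c, w) for c, w in windows], axis=0)


# Toy 2x2 "slice" in HU: air, soft tissue, enhanced tissue, dense bone.
slice_hu = np.array([[-1000.0, 40.0], [240.0, 1300.0]])
x = tri_window(slice_hu)
print(x.shape)
```

Each channel emphasizes a different intensity range of the same slice, which is one plausible reading of "enriching the input appearance before feature extraction"; the released code at the repository above is the authoritative reference for the actual preprocessing.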
