Table of Contents
Fetching ...

SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmentation

Andrei-Alexandru Bunea, Dan-Matei Popovici, Radu Tudor Ionescu

TL;DR

SegMate is presented, an efficient 2.5D framework that achieves state-of-the-art accuracy, while considerably reducing computational requirements, by meticulously integrating asymmetric architectures, attention mechanisms, multi-scale feature fusion, slice-based positional conditioning, and multi-task optimization.

Abstract

State-of-the-art models for medical image segmentation achieve excellent accuracy but require substantial computational resources, limiting deployment in resource-constrained clinical settings. We present SegMate, an efficient 2.5D framework that achieves state-of-the-art accuracy, while considerably reducing computational requirements. Our efficient design is the result of meticulously integrating asymmetric architectures, attention mechanisms, multi-scale feature fusion, slice-based positional conditioning, and multi-task optimization. We demonstrate the efficiency-accuracy trade-off of our framework across three modern backbones (EfficientNetV2-M, MambaOut-Tiny, FastViT-T12). We perform experiments on three datasets: TotalSegmentator, SegTHOR and AMOS22. Compared with the vanilla models, SegMate reduces computation (GFLOPs) by up to 2.5x and memory footprint (VRAM) by up to 2.1x, while generally registering performance gains of around 1%. On TotalSegmentator, we achieve a Dice score of 93.51% with only 295MB peak GPU memory. Zero-shot cross-dataset evaluations on SegTHOR and AMOS22 demonstrate strong generalization, with Dice scores of up to 86.85% and 89.35%, respectively. We release our open-source code at https://github.com/andreibunea99/SegMate.

SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmentation

TL;DR

SegMate is presented, an efficient 2.5D framework that achieves state-of-the-art accuracy, while considerably reducing computational requirements, by meticulously integrating asymmetric architectures, attention mechanisms, multi-scale feature fusion, slice-based positional conditioning, and multi-task optimization.

Abstract

State-of-the-art models for medical image segmentation achieve excellent accuracy but require substantial computational resources, limiting deployment in resource-constrained clinical settings. We present SegMate, an efficient 2.5D framework that achieves state-of-the-art accuracy, while considerably reducing computational requirements. Our efficient design is the result of meticulously integrating asymmetric architectures, attention mechanisms, multi-scale feature fusion, slice-based positional conditioning, and multi-task optimization. We demonstrate the efficiency-accuracy trade-off of our framework across three modern backbones (EfficientNetV2-M, MambaOut-Tiny, FastViT-T12). We perform experiments on three datasets: TotalSegmentator, SegTHOR and AMOS22. Compared with the vanilla models, SegMate reduces computation (GFLOPs) by up to 2.5x and memory footprint (VRAM) by up to 2.1x, while generally registering performance gains of around 1%. On TotalSegmentator, we achieve a Dice score of 93.51% with only 295MB peak GPU memory. Zero-shot cross-dataset evaluations on SegTHOR and AMOS22 demonstrate strong generalization, with Dice scores of up to 86.85% and 89.35%, respectively. We release our open-source code at https://github.com/andreibunea99/SegMate.
Paper Structure (5 sections, 3 equations, 2 figures, 3 tables)

This paper contains 5 sections, 3 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Segmentation accuracy vs. GFLOPs per volume ($\approx$162 slices) on TotalSegmentator. Point size indicates VRAM usage. All SegMate variants are below 300 MB.
  • Figure 2: SegMate architecture. CBAM modules are positioned inside the main decoder blocks after encoder-decoder fusion; SE gates nested skip blocks that cascade cross-scale features. FiLM conditions the bottleneck on normalized slice position $z_{\mathrm{norm}}$.