Adenocarcinoma Segmentation Using Pre-trained Swin-UNet with Parallel Cross-Attention for Multi-Domain Imaging
Abdul Qayyum, Moona Mazher, Imran Razzak, Steven A Niederer
TL;DR
The paper addresses domain shift in histopathology for adenocarcinoma segmentation by proposing a Swin-UNet framework with an ImageNet-pretrained encoder and a parallel cross-attention module to improve cross-organ and cross-scanner generalization. The approach integrates a Position-wise Attention Block and a Multi-Scale Fusion Attention Block (MFAB) to capture multi-scale and cross-domain feature dependencies. On the COSAS-2024 benchmark, the method achieves final scores of 0.7469 on the Cross-Organ task and 0.7597 on the Cross-Scanner task, indicating robustness across different organs and imaging devices. The work highlights the practical potential of pretrained transformer-based segmentation with cross-attention for improving reliability in real-world, multi-domain pathology workflows.
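To make the idea of a parallel cross-attention module more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`cross_attention`, `parallel_cross_attention`), the absence of learned query/key/value projections, and the residual fusion are all illustrative assumptions. It only shows two feature streams attending to each other in parallel using standard scaled dot-product attention.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one stream
    and keys/values come from another.

    queries: n_q x d, keys: n_k x d, values: n_k x d_v (lists of lists).
    Returns an n_q x d_v list of lists.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    d_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(d_v)])
    return out


def parallel_cross_attention(feat_a, feat_b):
    """Hypothetical parallel fusion: each stream queries the other, and the
    attended result is added back to the querying stream (residual).
    No learned projections are used here; a real module would include them.
    """
    a_from_b = cross_attention(feat_a, feat_b, feat_b)
    b_from_a = cross_attention(feat_b, feat_a, feat_a)
    fused_a = [[x + y for x, y in zip(r, s)] for r, s in zip(feat_a, a_from_b)]
    fused_b = [[x + y for x, y in zip(r, s)] for r, s in zip(feat_b, b_from_a)]
    return fused_a, fused_b


if __name__ == "__main__":
    # Two toy token sequences with feature dimension 2.
    stream_a = [[1.0, 0.0], [0.0, 1.0]]
    stream_b = [[1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
    fused_a, fused_b = parallel_cross_attention(stream_a, stream_b)
    print(len(fused_a), len(fused_b))
```

In a segmentation network the two streams would typically be flattened feature maps (for example, from different scales or branches), and each branch would keep its own token count after fusion, as the sketch preserves.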
Abstract
Computer-aided pathological analysis has been the gold standard for tumor diagnosis; however, domain shift is a significant problem in histopathology. It may be caused by variability in anatomical structures, tissue preparation, and imaging processes, which challenges the robustness of segmentation models. In this work, we present a framework consisting of a pre-trained encoder with a Swin-UNet architecture, enhanced by a parallel cross-attention module, to tackle adenocarcinoma segmentation across different organs and scanners, considering both morphological changes and scanner-induced domain variations. Experiments conducted on the Cross-Organ and Cross-Scanner Adenocarcinoma Segmentation challenge dataset showed that our framework achieved segmentation scores of 0.7469 for the cross-organ track and 0.7597 for the cross-scanner track on the final challenge test sets, indicating that it handles diverse imaging conditions effectively and improves segmentation accuracy across varying domains.
