ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting
Yankai Jiang, Zhongzhen Huang, Rongzhao Zhang, Xiaofan Zhang, Shaoting Zhang
TL;DR
ZePT tackles the long-tailed, multi-organ tumor segmentation challenge by introducing a two-stage, query-disentangling framework. Stage-I learns organ-centric fundamental queries to build robust organ representations, while Stage-II uses self-generated visual prompts to guide advanced tumor queries, aided by cross-modal query-knowledge alignment with medical-domain text embeddings. The approach yields state-of-the-art zero-shot tumor segmentation on MSD and a real-world colon dataset, with strong improvements in AUROC, FPR$_{95}$, and DSC and competitive performance on seen organs. These findings highlight the practical potential of zero-shot pan-tumor segmentation in clinical settings and suggest avenues for further improvement via data augmentation, cross-modal knowledge, and modality expansion.
Abstract
The long-tailed distribution problem in medical image analysis reflects a high prevalence of common conditions and a low prevalence of rare ones, which poses a significant challenge in developing a unified model capable of identifying rare or novel tumor categories not encountered during training. In this paper, we propose a new zero-shot pan-tumor segmentation framework (ZePT) based on query-disentangling and self-prompting to segment unseen tumor categories beyond the training set. ZePT disentangles the object queries into two subsets and trains them in two stages. In the first stage, it learns a set of fundamental queries for organ segmentation through an object-aware feature grouping strategy, which gathers organ-level visual features. In the second stage, it refines the other set of advanced queries, which focus on auto-generated visual prompts for unseen tumor segmentation. Moreover, we introduce query-knowledge alignment at the feature level to enhance each query's discriminative representation and generalizability. Extensive experiments on various tumor segmentation tasks demonstrate that ZePT outperforms previous counterparts and show its promising ability for zero-shot tumor segmentation in real-world settings.
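Two of the metrics named in the TL;DR, DSC and FPR$_{95}$, have standard definitions that can be sketched in a few lines. The snippet below is a minimal illustration using NumPy, not the authors' evaluation code. The function names `dice_score` and `fpr_at_tpr` are hypothetical, and treating tumor voxels as the positive class with higher scores meaning "more likely tumor" is an assumption about the convention.

```python
import numpy as np


def dice_score(pred, target):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0
    return float(2.0 * intersection / denom)


def fpr_at_tpr(scores, labels, tpr=0.95):
    """False positive rate at the threshold that keeps at least `tpr` of positives.

    scores: per-voxel scores, higher means more likely positive (assumed convention).
    labels: per-voxel ground truth, truthy for the positive class.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = np.sort(scores[labels])   # positive scores, ascending
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Dropping the k lowest-scoring positives still retains >= tpr of them.
    k = int(np.floor((1.0 - tpr) * pos.size))
    threshold = pos[k]
    # Fraction of negatives that would be flagged at this threshold.
    return float((neg >= threshold).mean())
```

With `tpr=0.95` the second function gives FPR$_{95}$: lower values mean fewer healthy voxels are flagged while 95% of positive voxels are still detected. AUROC can be obtained from the same `scores` and `labels` with a standard library routine.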
