Test-time generative augmentation for medical image segmentation

Xiao Ma, Yuhui Tao, Zetian Zhang, Yuhan Zhang, Xi Wang, Sheng Zhang, Zexuan Ji, Yizhe Zhang, Qiang Chen, Guang Yang

TL;DR

This work tackles the uncertainty and robustness issues that medical image segmentation models face at inference time due to occlusions, boundary ambiguity, and cross-device variation. It proposes TTGA, a test-time generative augmentation framework that uses a domain-adapted diffusion model and masked null-text inversion to create region-specific augmentations conditioned on semantic context and image identity. It integrates dual denoising paths with region masks and multi-condition guidance to balance content preservation with meaningful variability, and is validated across three tasks and multiple datasets, showing consistent segmentation accuracy gains and improved pixel-wise uncertainty estimation. The approach offers a practical, scalable tool for more reliable clinical image analysis under domain shift and data variability.

Abstract

Medical image segmentation is critical for clinical diagnosis, treatment planning, and monitoring, yet segmentation models often struggle with uncertainties stemming from occlusions, ambiguous boundaries, and variations in imaging devices. Traditional test-time augmentation (TTA) techniques typically rely on predefined geometric and photometric transformations, limiting their adaptability and effectiveness in complex medical scenarios. In this study, we introduce Test-Time Generative Augmentation (TTGA), a novel augmentation strategy specifically tailored for medical image segmentation at inference time. Unlike conventional augmentation strategies, which suffer from excessive randomness or limited flexibility, TTGA leverages a domain-fine-tuned generative model to produce contextually relevant and diverse augmentations tailored to the characteristics of each test image. Built upon diffusion model inversion, a masked null-text inversion method is proposed to enable region-specific augmentations during sampling. Furthermore, a dual denoising pathway is designed to balance precise identity preservation with controlled variability. We demonstrate the efficacy of TTGA through extensive experiments across three distinct segmentation tasks spanning nine datasets. The results consistently show that TTGA not only improves segmentation accuracy (with DSC gains ranging from 0.1% to 2.3% over the baseline) but also enables pixel-wise error estimation (with DSC gains ranging from 1.1% to 29.0% over the baseline). The source code and demonstration are available at: https://github.com/maxiao0234/TTGA.
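
To make the inference-time procedure concrete, the following is a minimal sketch of how a TTGA-style loop could consume generative augmentations: invert the test image to a noise latent once, sample several augmented variants, segment each, and aggregate the predictions. The helpers `invert_to_noise` and `generate_augmentation` are hypothetical stand-ins for the paper's diffusion inversion and dual-path sampling, and the mean/variance aggregation is an assumption rather than the paper's exact error-estimation rule.

```python
import torch

def ttga_predict(image, seg_model, invert_to_noise, generate_augmentation,
                 n_aug=8):
    """Minimal sketch of a TTGA-style inference loop.

    `invert_to_noise` and `generate_augmentation` are hypothetical
    callables standing in for the paper's diffusion inversion and
    dual-path augmented sampling; `seg_model` returns per-pixel logits.
    """
    # Invert the test image to a noise latent once; every augmentation
    # is then sampled from this shared, identity-preserving starting point.
    latent = invert_to_noise(image)

    preds = []
    with torch.no_grad():
        preds.append(torch.sigmoid(seg_model(image)))  # original image
        for _ in range(n_aug):
            aug = generate_augmentation(latent)  # one region-specific variant
            preds.append(torch.sigmoid(seg_model(aug)))

    preds = torch.stack(preds)   # stack along a new augmentation axis
    seg = preds.mean(dim=0)      # aggregated segmentation probability
    err = preds.var(dim=0)       # pixel-wise error/uncertainty estimate
    return seg, err
```

Averaging over augmentations smooths out prediction noise in ambiguous regions, while the per-pixel spread of the predictions is a natural candidate for the pixel-wise error maps the paper evaluates.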

Paper Structure

This paper contains 36 sections, 21 equations, 6 figures, 2 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Visualization of TTGA augmentation results on three exemplary images. The original images present challenges for segmentation due to tissue overlap, blurred boundaries, and diverse acquisition conditions. TTGA-augmented images introduce variations in local structure, sharpness, and imaging style, which enhance segmentation accuracy and robustness, while also supporting uncertainty estimation and model reliability. The color bars indicate the scales for segmentation probability and error estimation, respectively.
  • Figure 2: The proposed pipeline, comprising three key workflows, is presented. The test image is processed through a sequence of steps to generate a noise image at a designated step count. Using this noise image, a one-step denoising process is employed to refine a trainable null-text embedding, enabling the stable generation of results that closely resemble the initial image. In the augmentation generation phase, this null-text embedding, guided by semantic and regional information, is leveraged to produce a series of augmented images (a minimal illustrative sketch of these two steps is given after the figure list).
  • Figure 3: Results of TTGA on a fundus image under different guidance scales. (a) The original, unaugmented image. (b) The corresponding ground truth segmentation of the optic disc and cup. (c) Visualization of augmented images generated using various combinations of identity guidance scales and semantic guidance scales. (d) Segmentation results produced by the model on the augmented images.
  • Figure 4: Qualitative comparison of segmentation results and error estimation. This figure provides a visual comparison of TTGA (Ours) against baseline models and other test-time methods across three tasks: (a) Fundus, (b) Polyp, and (c) Skin. The "Error" (target highlighted in orange) represents the ground truth for uncertainty, which is generated by visualizing the pixel-wise difference between the binarized baseline segmentation (using a 0.5 threshold on the 0-1 normalized output) and the "Ground Truth" (also highlighted in orange). The objective for the "Segmentation" column is to match the "Ground Truth," while the objective for the "Error Estimation" column is to match the "Error." The color bars indicate the scales for segmentation probability and error estimation, respectively.
  • Figure 5: Visualization of average views for two representative samples. For each identity guidance scale, augmented images exhibit local variations in detail. The average view of multiple augmentations closely resembles the original image, indicating that the augmentations are centered around the original image with minimal bias.
  • ...and 1 more figure
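
As a companion to the Figure 2 caption above, here is a minimal sketch of the two steps it describes: refining a trainable null-text embedding via one-step denoising, and a dual-path guided denoising step that applies semantic guidance only inside a region mask. The `unet(z, t, cond)` call signature, the simplified reconstruction objective, and the exact multi-condition combination rule are all assumptions for illustration, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def refine_null_text(unet, z_t, t, null_emb, eps_target, steps=10, lr=1e-2):
    """Optimize the null-text embedding so that denoising from the inverted
    noise reproduces the original image's trajectory. Simplified objective:
    match the noise prediction recorded during inversion (the paper's
    one-step-denoising objective may differ)."""
    null_emb = null_emb.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([null_emb], lr=lr)
    for _ in range(steps):
        loss = F.mse_loss(unet(z_t, t, null_emb), eps_target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return null_emb.detach()

def dual_path_step(unet, z_t, t, uncond_emb, null_emb, sem_emb, mask,
                   s_id=1.0, s_sem=3.0):
    """One denoising step with identity and semantic guidance.

    `mask` is a {0, 1} region mask broadcastable to the latent; `s_id`
    and `s_sem` play the role of the identity and semantic guidance
    scales explored in Figure 3 (the combination rule is an assumption).
    """
    eps_u = unet(z_t, t, uncond_emb)  # plain unconditional path
    eps_i = unet(z_t, t, null_emb)    # identity-preserving path
    eps_s = unet(z_t, t, sem_emb)     # semantic-variation path

    # Identity guidance acts everywhere; semantic guidance is confined to
    # the masked region, so edits stay local while content is preserved.
    return eps_u + s_id * (eps_i - eps_u) + mask * s_sem * (eps_s - eps_u)
```

Outside the mask the prediction reduces to identity-guided denoising, which is consistent with the observation in Figure 5 that the average of many augmentations stays close to the original image.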