S$^3$-TTA: Scale-Style Selection for Test-Time Augmentation in Biomedical Image Segmentation

Kangxian Xie, Siyu Huang, Sebastian Andres Cajas Ordonez, Hanspeter Pfister, Donglai Wei

TL;DR

S$^3$-TTA addresses domain generalization in biomedical image segmentation by selecting a single, optimal test-time augmentation per image through a transformation consistency metric. It combines scale and style augmentations with AdaIN-based style transfer in an end-to-end augmentation-segmentation pipeline, and uses a rotational self-consistency score, measured as mean absolute error (MAE), to choose the most reliable augmented view for segmentation. The approach improves on prior methods by about 3.4% on cell nuclei segmentation and about 1.3% on chest X-ray lung segmentation benchmarks. This plug-in TTA strategy improves robustness to domain shifts while remaining compatible with existing segmentation models and tasks.

Abstract

Deep-learning models have been successful in biomedical image segmentation. To generalize for real-world deployment, test-time augmentation (TTA) methods are often used to transform the test image into different versions that are hopefully closer to the training domain. Unfortunately, due to the vast diversity of instance scales and image styles, many augmented test images produce undesirable results, thus lowering the overall performance. This work proposes a new TTA framework, S$^3$-TTA, which selects a suitable image scale and style for each test image based on a transformation consistency metric. In addition, S$^3$-TTA constructs an end-to-end augmentation-segmentation joint-training pipeline to ensure a task-oriented augmentation. On public benchmarks for cell and lung segmentation, S$^3$-TTA demonstrates improvements over the prior art by 3.4% and 1.3%, respectively, by simply augmenting the input data in the testing phase.

Paper Structure (13 sections, 4 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Scale-and-style-aware TTA. (a-b) Due to the diversity of biomedical images, a pre-trained segmentation model may fail significantly if the test image has an unexpected style or scale. (c) Thus, instead of aggregating over all augmentations, we propose to select the suitable style and scale before the aggregation.
  • Figure 2: Model overview. For a test image, we first apply scale and style augmentations at different rotation angles. We then employ a consistency-based metric to select the best augmented images for segmentation.
  • Figure 3: Scale-style selector. This module measures the rotational consistency of all augmentations and picks the best one for segmentation. Two scales/styles are used for illustration.
  • Figure 4: Qualitative results on cell image segmentation. The fourth row shows our co-training pipeline without augmentation, while the final row shows the results of our multi-scale/style method.
  • Figure 5: Qualitative results on X-ray lung segmentation.
  • ...and 1 more figure
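
The selection step outlined in the captions of Figures 2 and 3 can be sketched as follows. This is a hedged illustration under stated assumptions rather than the paper's code: it uses a single 90-degree rotation, plain nested lists instead of tensors, and a hypothetical stand-in model `toy_segment`. The helper names (`rot90`, `rot270`, `mae`, `rotation_inconsistency`, `select_view`) are all invented for this example. The idea is that a view on which the model is reliable should give nearly the same mask whether or not the input is rotated, so the view with the lowest MAE between the two predictions is chosen.

```python
def rot90(img):
    """Rotate a 2-D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rot270(img):
    """Rotate a 2-D list 90 degrees counter-clockwise (undoes rot90)."""
    return [list(row) for row in zip(*img)][::-1]


def mae(a, b):
    """Mean absolute error between two equally shaped 2-D lists."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def rotation_inconsistency(segment, image):
    """Compare the prediction on `image` with the prediction on its rotated
    copy after rotating that prediction back. Lower is more consistent."""
    direct = segment(image)
    via_rotation = rot270(segment(rot90(image)))
    return mae(direct, via_rotation)


def select_view(segment, candidates):
    """Score every augmented view and return the most consistent one."""
    scores = {name: rotation_inconsistency(segment, img)
              for name, img in candidates.items()}
    return min(scores, key=scores.get), scores


def toy_segment(img):
    """Hypothetical stand-in model: a threshold with a row-dependent bias,
    so low-contrast inputs yield orientation-dependent masks."""
    h = len(img)
    return [[1.0 if v + 0.2 * i / h > 0.5 else 0.0 for v in row]
            for i, row in enumerate(img)]


# Two hypothetical augmented views of the same test image.
candidates = {
    "high_contrast": [[0.0, 1.0], [1.0, 0.0]],
    "low_contrast": [[0.45, 0.45], [0.45, 0.45]],
}
best, scores = select_view(toy_segment, candidates)
```

In this toy setup the high-contrast view is selected, because the stand-in model's mask for it does not change under rotation, while the low-contrast view's mask does. A real pipeline would presumably replace `toy_segment` with the trained segmentation network and build `candidates` from the scale and style augmentations.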