HD-TTA: Hypothesis-Driven Test-Time Adaptation for Safer Brain Tumor Segmentation

Kartik Jhawar, Lipo Wang

TL;DR

The paper addresses safety-critical brain tumor segmentation under domain shift, where conventional test-time adaptation can cause negative transfer by updating predictions indiscriminately. It introduces Hypothesis-Driven Test-Time Adaptation (HD-TTA), a decision-oriented framework that generates two competing geometric hypotheses (compact denoising and diffuse inflation), uses a Gatekeeper to skip adaptation on confident cases, and applies a representation-guided selector to choose the safest outcome. In the experiments, an nnU-Net v2 backbone trained on BraTS-GLI is evaluated on the cross-domain BraTS-PED and BraTS-MEN targets, where HD-TTA achieves lower boundary error (HD95) and higher Precision than strong baselines while maintaining comparable Dice. The results suggest that explicit hypothesis selection mitigates safety risks under distribution shift, and the framework is described as modular and extensible to other structured prediction tasks.

Abstract

Standard Test-Time Adaptation (TTA) methods typically treat inference as a blind optimization task, applying generic objectives to all or filtered test samples. In safety-critical medical segmentation, this lack of selectivity often causes the tumor mask to spill into healthy brain tissue or degrades predictions that were already correct. We propose Hypothesis-Driven TTA (HD-TTA), a novel framework that reformulates adaptation as a dynamic decision process. Rather than forcing a single optimization trajectory, our method generates two intuitive, competing geometric hypotheses: compaction (is the prediction noisy? trim artifacts) versus inflation (is the valid tumor under-segmented? safely inflate to recover). It then employs a representation-guided selector to autonomously identify the safest outcome based on intrinsic texture consistency. Additionally, a pre-screening Gatekeeper prevents negative transfer by skipping adaptation on confident cases. We validate this proof of concept on a cross-domain binary brain tumor segmentation task, applying a source model trained on adult BraTS gliomas to unseen pediatric and more challenging meningioma target domains. HD-TTA improves safety-oriented outcomes (95th-percentile Hausdorff Distance (HD95) and Precision) over several representative state-of-the-art baselines in the challenging safety regime, reducing HD95 by approximately 6.4 mm and improving Precision by over 4%, while maintaining comparable Dice scores. These results demonstrate that resolving the safety-adaptation trade-off via explicit hypothesis selection is a viable, robust path for safe clinical model deployment. Code will be made publicly available upon acceptance.
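
The three metrics named in the abstract can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' evaluation code: the function names, the empty-mask conventions, the 6-neighbour surface definition, and the brute-force pairwise distance are all choices made here, and published HD95 implementations differ in these details.

```python
import numpy as np

def dice(pred, gt):
    """Dice overlap between two binary masks."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    denom = pred.sum() + gt.sum()
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if denom == 0 else float(2.0 * (pred & gt).sum() / denom)

def precision(pred, gt):
    """Fraction of predicted voxels that are true tumor voxels."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    n_pred = pred.sum()
    # Assumed convention: no predicted voxels means no false positives.
    return 1.0 if n_pred == 0 else float((pred & gt).sum() / n_pred)

def surface_points(mask, spacing=None):
    """Coordinates (in physical units) of foreground voxels that touch background."""
    mask = np.asarray(mask, bool)
    spacing = np.ones(mask.ndim) if spacing is None else np.asarray(spacing, float)
    padded = np.pad(mask, 1, constant_values=False)
    crop = tuple(slice(1, -1) for _ in range(mask.ndim))
    interior = mask.copy()
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel stays "interior" only if every axis-aligned neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[crop]
    return np.argwhere(mask & ~interior) * spacing

def hd95(pred, gt, spacing=None):
    """Symmetric 95th-percentile surface distance (brute force, small masks only)."""
    p, g = surface_points(pred, spacing), surface_points(gt, spacing)
    if len(p) == 0 or len(g) == 0:
        return float("nan")  # undefined when either surface is empty
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))

# Toy example: a 4x4x4 "tumor" and a prediction shifted by 2 voxels along one axis.
gt = np.zeros((10, 10, 10), bool)
gt[3:7, 3:7, 3:7] = True
pred = np.zeros_like(gt)
pred[5:9, 3:7, 3:7] = True
print(dice(pred, gt), precision(pred, gt), hd95(pred, gt))
```

Precision penalizes only false positives, and HD95 grows when the predicted boundary drifts away from the true one, which is why both serve as safety-oriented measures when the concern is a mask spilling into healthy tissue. Because HD95 takes a percentile rather than the maximum, a handful of isolated stray voxels may leave it unchanged.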

Paper Structure

This paper contains 7 sections, 3 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Comparison of Standard TTA versus our proposed HD-TTA framework.
  • Figure 2: Overview of the proposed HD-TTA framework. The inference pipeline consists of a selective Gatekeeper, parallel hypothesis generation (e.g., $H_{compact}$ winning to trim spurious false positives, versus $H_{diffuse}$), and an unsupervised representation-guided selector.
  • Figure 3: Qualitative comparison on representative BraTS-MEN (Rows 1-2) and BraTS-PED cases (Rows 3-4).
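
The control flow described in the Figure 2 caption (gate, two parallel hypotheses, unsupervised selection) can be sketched as follows. This is a toy illustration only and not the authors' method: the abstract does not specify how the hypotheses are generated or how the selector scores them, so everything concrete here is a hypothetical stand-in. In particular, the names `gatekeeper_is_confident`, `texture_inconsistency`, and `hd_tta_sketch`, the thresholds `tau`, `t_compact`, and `t_diffuse`, the use of probability thresholds to produce the compact and diffuse masks, and the use of intensity variance as a proxy for texture consistency are all assumptions.

```python
import numpy as np

def gatekeeper_is_confident(prob, tau=0.9):
    """Hypothetical gate: mean per-voxel confidence max(p, 1 - p) reaches tau."""
    return float(np.maximum(prob, 1.0 - prob).mean()) >= tau

def texture_inconsistency(image, mask):
    """Hypothetical selector score: intensity variance inside the mask (lower is better)."""
    if not mask.any():
        return float("inf")  # an empty mask can never win
    return float(image[mask].var())

def hd_tta_sketch(image, prob, tau=0.9, t_compact=0.7, t_diffuse=0.3):
    """Return (mask, decision) where decision is 'skipped', 'compact', or 'diffuse'."""
    baseline = prob > 0.5
    if gatekeeper_is_confident(prob, tau):
        # Confident case: leave the source prediction untouched.
        return baseline, "skipped"
    # Stand-ins for the two geometric hypotheses (assumed, not from the paper).
    hypotheses = {
        "compact": prob > t_compact,  # stricter threshold trims uncertain voxels
        "diffuse": prob > t_diffuse,  # looser threshold recovers uncertain voxels
    }
    scores = {name: texture_inconsistency(image, m) for name, m in hypotheses.items()}
    winner = min(scores, key=scores.get)
    return hypotheses[winner], winner

# Toy 2D case: a bright homogeneous "tumor" with an uncertain ring around it.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
prob = np.full((8, 8), 0.05)
prob[1:7, 1:7] = 0.4   # uncertain ring over healthy tissue
prob[2:6, 2:6] = 0.8   # true tumor region
mask, decision = hd_tta_sketch(image, prob)
print(decision, int(mask.sum()))
```

In this toy case the compact hypothesis wins because the diffuse mask mixes tumor and background intensities. A variance score of this kind tends to favor smaller masks in general, so a practical selector would need a score that can also reward justified inflation; the sketch only shows where such a score plugs in.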