Annotation-Efficient Active Test-Time Adaptation with Conformal Prediction

Tingyu Shi, Fan Lyu, Shaoliang Peng

TL;DR

This paper tackles the inefficiency of Active Test-Time Adaptation (ATTA) under domain shift by introducing CPATTA, which replaces heuristic uncertainty with principled conformal prediction (CP) signals. CPATTA uses smoothed, top-$K$ certainty scores from two CP modules (pretrained and real-time) to guide selective human labeling and confident pseudo-labeling, while a domain-shift detector modulates annotation budgets. An online weighting mechanism using pseudo coverage calibrates CP under evolving domains, supported by a two-stage model update that anchors updates with human labels and expands learning with model labels. Empirical results on PACS, VLCS, and Tiny-ImageNet-C show consistent accuracy gains of about 5% over SOTA ATTA methods and improved data-selection efficiency, with code available at the provided repository. These contributions advance robust, data-efficient test-time adaptation for real-world, shifting environments.

Abstract

Active Test-Time Adaptation (ATTA) improves model robustness under domain shift by selectively querying human annotations at deployment, but existing methods rely on heuristic uncertainty measures and suffer from low data-selection efficiency, which wastes the human annotation budget. We propose Conformal Prediction Active TTA (CPATTA), the first method to bring principled, coverage-guaranteed uncertainty into ATTA. CPATTA employs smoothed conformal scores with a top-$K$ certainty measure, an online weight-update algorithm driven by pseudo coverage, a domain-shift detector that adapts human supervision, and a staged update scheme that balances human-labeled and model-labeled data. Extensive experiments demonstrate that CPATTA consistently outperforms state-of-the-art ATTA methods by around 5% in accuracy. Our code and datasets are available at https://github.com/tingyushi/CPATTA.
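To make "coverage-guaranteed uncertainty" concrete, here is a minimal sketch of generic split conformal prediction for classification. It is not CPATTA's method: the smoothed scores, top-$K$ certainty measure, and online weighting are not specified in this summary, so the sketch assumes the basic nonconformity score $1 - p(\text{true class})$. The function names `conformal_threshold` and `prediction_sets` are hypothetical.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a held-out calibration set.

    Assumes the basic score s = 1 - p(true class); CPATTA's smoothed
    score is not reproduced here.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Rank of the finite-sample corrected (1 - alpha) quantile.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: include every class.
        return np.inf
    return np.sort(scores)[k - 1]

def prediction_sets(probs, qhat):
    """Boolean mask of shape (batch, classes): class k is in the set
    when its score 1 - p_k does not exceed the threshold."""
    return (1.0 - np.asarray(probs, dtype=float)) <= qhat

# Toy calibration data: 4 samples, 3 classes (illustrative values only).
cal_probs = np.array([[0.9, 0.05, 0.05],
                      [0.2, 0.7, 0.1],
                      [0.1, 0.3, 0.6],
                      [0.5, 0.4, 0.1]])
cal_labels = np.array([0, 1, 2, 0])
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
sets = prediction_sets(np.array([[0.6, 0.3, 0.1]]), qhat)
```

Under exchangeability between calibration and test data, sets built this way contain the true label with probability at least $1 - \alpha$. A small prediction set signals a confident sample and a large one signals an uncertain sample, which is the kind of signal an active method can use to decide what to send to a human annotator.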

Paper Structure

This paper contains 9 sections, 17 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Comparison between existing ATTA methods and CPATTA. Selection efficiency is defined as the fraction of useful samples: correct predictions for model-annotated data and incorrect predictions for human-annotated data. CPATTA achieves higher selection efficiency in both cases, enabling better real-time and post-adaptation performance under the same annotation budget.
  • Figure 2: Overview of the CPATTA method. The real-time model $f(\cdot;\theta)$ makes predictions as soon as an online batch of data arrives. If CPATTA detects that the current batch's domain differs from the previous batch's domain ($\mathcal{D}_t \neq \mathcal{D}_{t-1}$), it selects more data for human annotation to preserve real-time performance. Two CPs provide uncertainty measures for each sample in the batch; humans annotate the uncertain samples, while the model annotates the certain ones. CPATTA then updates the weights of the two CPs based on the pseudo coverages calculated from model predictions and prediction sets.
  • Figure 3: Comparison of our CP with other CPs on VLCS.
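
The selection efficiency defined in the Figure 1 caption can be computed as in the sketch below. It assumes ground-truth labels are available for offline evaluation and that each sample was routed either to a human or to the model; the function name `selection_efficiency` and its arguments are hypothetical and are not taken from the CPATTA repository.

```python
import numpy as np

def selection_efficiency(preds, labels, sent_to_human):
    """Return (human_efficiency, model_efficiency).

    Human-annotated samples are useful when the model prediction was wrong;
    model-annotated samples are useful when the prediction was right.
    An empty group yields NaN.
    """
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    human = np.asarray(sent_to_human, dtype=bool)
    model = ~human
    correct = preds == labels
    human_eff = float(np.mean(~correct[human])) if human.any() else float("nan")
    model_eff = float(np.mean(correct[model])) if model.any() else float("nan")
    return human_eff, model_eff

# Toy batch of six samples (illustrative values only).
preds = [0, 1, 2, 1, 0, 2]
labels = [0, 1, 1, 0, 0, 2]
sent_to_human = [False, False, True, True, True, False]
human_eff, model_eff = selection_efficiency(preds, labels, sent_to_human)
```

In this toy batch, two of the three human-annotated samples were mispredicted and all three model-annotated samples were correct, so the human budget is partly spent on a sample the model already classified correctly.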