CTTA-T: Continual Test-Time Adaptation for Text Understanding via Teacher-Student with a Domain-aware and Generalized Teacher

Tianlun Liu, Zhiliang Tian, Zhen Huang, Xingzhi Zhou, Wanlong Yu, Tianle Liu, Feng Liu, Dongsheng Li

TL;DR

The paper tackles continual test-time adaptation for text understanding by introducing CTTA-T, a four-module framework that combines a teacher-student backbone, a refine-then-filter mechanism based on prediction consistency, a domain-aware teacher with IPCA-driven dynamic accumulation, and a stochastic restoration strategy to preserve generalization. It also provides a CTTA benchmark spanning multiple NLP tasks to evaluate continual domain shifts. Empirical results show CTTA-T achieves state-of-the-art performance and superior stability across diverse task streams, outperforming strong baselines. The work offers a practical path toward robust, online adaptation of NLP systems in evolving domains.

Abstract

Text understanding often suffers from domain shifts. Domain adaptation (DA) trains a model to adapt to a fixed, observed testing domain. A more challenging paradigm, test-time adaptation (TTA), cannot access the testing domain during training and adapts online to the testing samples, which come from a single fixed domain. We explore a more practical and underexplored scenario, continual test-time adaptation (CTTA) for text understanding, in which the model faces a sequence of unobserved testing domains. Current CTTA methods struggle to reduce error accumulation across domains and to generalize to unobserved domains: 1) noise filtering reduces accumulated errors but discards useful information, and 2) accumulating historical domains enhances generalization, but adaptive accumulation is hard to achieve. In this paper, we propose CTTA-T (continual test-time adaptation for text understanding), a framework that adapts to evolving target domains. It adopts a teacher-student framework in which the teacher is domain-aware and generalized for evolving domains. To improve teacher predictions, we propose a refine-then-filter mechanism based on dropout-driven consistency, which calibrates predictions and removes unreliable guidance. To balance adaptation and generalization, we construct a domain-aware teacher by dynamically accumulating cross-domain semantics via incremental PCA, which continuously tracks domain shifts. Experiments show that CTTA-T outperforms the baselines.
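
The domain-aware teacher update and the stochastic restoration mentioned in the TL;DR can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `update_teacher`, the parameters `base_momentum` and `restore_prob`, and the mapping from domain distance to update weight are all hypothetical. The domain distance is taken as a given number here; in the paper it is derived from incremental PCA, which this sketch does not reproduce.

```python
import numpy as np


def update_teacher(teacher, student, source, domain_distance, rng,
                   base_momentum=0.99, restore_prob=0.01):
    """One hypothetical teacher update step on flat parameter vectors.

    teacher, student, source: 1-D arrays of the same shape
        (source = parameters of the original pre-adaptation model).
    domain_distance: non-negative scalar, assumed to be supplied by a
        separate domain tracker (IPCA-based in the paper).
    """
    # Assumption: a larger domain distance lowers the momentum, so the
    # teacher follows the student more closely after a big shift.
    momentum = base_momentum / (1.0 + domain_distance)

    # Exponential-moving-average style update of the teacher.
    new_teacher = momentum * teacher + (1.0 - momentum) * student

    # Stochastic restoration: each parameter is reset to its source value
    # independently with probability restore_prob.
    mask = rng.random(new_teacher.shape) < restore_prob
    return np.where(mask, source, new_teacher)
```

With `restore_prob=0.0` the function reduces to a plain weighted average of teacher and student, and with `restore_prob=1.0` it returns the source parameters unchanged, which makes the two mechanisms easy to inspect separately.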

Paper Structure

This paper contains 34 sections, 37 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Comparison of three settings for adapting to domain shift. DA, TTA, and CTTA represent progressively more challenging and practical scenarios.
  • Figure 2: Overview of CTTA-T. When a textual sample $x_t$ arrives (blue box), both the teacher and the student predict on it. Low-entropy teacher outputs guide the student via cross-entropy; otherwise, $\mathbf{s}_{\text{max}}$ is computed (bottom-right). Samples with low $\mathbf{s}_{\text{max}}$ are discarded, while for samples with high $\mathbf{s}_{\text{max}}$ the consistency distribution is combined with the teacher output to give a refined student update. The student then updates the teacher (top-right, upper panel), with the update weight determined by the IPCA domain distance (bottom-left). Finally, a random subset of the teacher’s parameters is restored (top-right, lower panel).
  • Figure 3: Performance of our method and the baselines on order 5.
  • Figure 4: Performance of different filtering strategies.
  • Figure 5: Performance of our method and of a model using fixed weights under domain shift (EM reported at each step).
  • ...and 3 more figures
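
The sample-level flow described in the Figure 2 caption (entropy gate, then filtering and refinement) can be sketched as follows. This is an illustration under stated assumptions, not the paper's definition: the function names, the thresholds `entropy_thresh` and `smax_thresh`, and the mixing weight `mix` are hypothetical, and $\mathbf{s}_{\text{max}}$ is assumed here to be the largest entry of a vote-frequency distribution over several dropout forward passes.

```python
import numpy as np


def entropy(p, eps=1e-12):
    """Shannon entropy (natural log) of a probability vector."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))


def refine_then_filter(teacher_probs, dropout_probs,
                       entropy_thresh=0.5, smax_thresh=0.6, mix=0.5):
    """Return the target distribution for the student, or None to discard.

    teacher_probs: shape (C,), teacher prediction for one sample.
    dropout_probs: shape (K, C), predictions from K dropout passes.
    """
    teacher_probs = np.asarray(teacher_probs, dtype=float)

    # Confident (low-entropy) teacher outputs are used directly.
    if entropy(teacher_probs) < entropy_thresh:
        return teacher_probs

    # Assumed consistency distribution: fraction of dropout passes
    # voting for each class.
    votes = np.argmax(np.asarray(dropout_probs, dtype=float), axis=1)
    consistency = np.bincount(votes, minlength=teacher_probs.size) / len(votes)

    # Filter: discard samples whose dropout passes disagree too much.
    s_max = consistency.max()
    if s_max < smax_thresh:
        return None

    # Refine: blend the teacher output with the consistency distribution.
    refined = mix * teacher_probs + (1.0 - mix) * consistency
    return refined / refined.sum()
```

A returned distribution would serve as the soft target in the student's cross-entropy loss, while a `None` result means the sample contributes no update at that step.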