BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
Daeun Lee, Jaehong Yoon, Sung Ju Hwang
TL;DR
BECoTTA addresses continual test-time adaptation by introducing Mixture-of-Domain Low-rank Experts (MoDE) with domain-adaptive routing and a Domain-Expert Synergy Loss to enable input-dependent, sparse updates that preserve past knowledge. It also introduces the Continual Gradual Shifts (CGS) benchmark to evaluate adaptation under gradual domain changes. Empirically, BECoTTA and its SDA-enhanced variant BECoTTA+ outperform strong CTTA baselines across disjoint and gradual shifts while requiring roughly 98% fewer trainable parameters, and they perform well on segmentation, classification, and zero-shot domain generalization. Its modularity, efficiency, and domain-aware specialization make the approach well suited to edge devices and real-world deployment.
Abstract
Continual Test-Time Adaptation (CTTA) requires a model to adapt efficiently to a continuous stream of unseen domains while retaining previously learned knowledge. However, despite the progress of CTTA, it is still challenging to deploy a model that improves both the forgetting-adaptation trade-off and efficiency. In addition, current CTTA scenarios assume only disjoint domain shifts, even though real-world domains change seamlessly. To address these challenges, this paper proposes BECoTTA, an input-dependent and efficient modular framework for CTTA. We propose Mixture-of-Domain Low-rank Experts (MoDE), which contains two core components: (i) Domain-Adaptive Routing, which helps to selectively capture domain-adaptive knowledge with multiple domain routers, and (ii) a Domain-Expert Synergy Loss that maximizes the dependency between each domain and expert. We validate that our method outperforms prior methods across multiple CTTA scenarios, including disjoint and gradual domain shifts, while requiring ~98% fewer trainable parameters. We also provide analyses of our method, including the construction of experts, the effect of domain-adaptive experts, and visualizations.
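To make the two components above more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a residual block in which each domain has its own linear router, the router keeps the top-k experts for each input, and every expert is a low-rank down/up projection pair. It also assumes that the "dependency" between domains and experts can be measured as mutual information over a domain-by-expert assignment table. The names `MoDESketch`, `top_k_gates`, and `domain_expert_mutual_information`, the zero initialization of the up-projection, and the explicit `domain` argument (in place of whatever mechanism selects a router at test time) are all hypothetical choices made for illustration.

```python
import numpy as np


def top_k_gates(logits, k):
    """Softmax over the k largest logits; every other expert gets weight 0."""
    idx = np.argsort(logits)[-k:]
    exp = np.exp(logits[idx] - logits[idx].max())
    gates = np.zeros_like(logits, dtype=float)
    gates[idx] = exp / exp.sum()
    return gates


class MoDESketch:
    """Hypothetical mixture of low-rank experts with one router per domain."""

    def __init__(self, dim, rank, num_experts, num_domains, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        # One linear router per domain: (num_domains, dim, num_experts).
        self.routers = rng.normal(0.0, 0.02, (num_domains, dim, num_experts))
        # Each expert is a low-rank pair: dim -> rank -> dim.
        self.down = rng.normal(0.0, 0.02, (num_experts, dim, rank))
        # Assumption: zero-initialised up-projection, so the block starts as identity.
        self.up = np.zeros((num_experts, rank, dim))

    def forward(self, x, domain):
        """x has shape (dim,); `domain` picks which router scores the experts."""
        gates = top_k_gates(x @ self.routers[domain], self.k)
        out = x.copy()
        # Sparse, input-dependent update: only the selected experts contribute.
        for e in np.flatnonzero(gates):
            out = out + gates[e] * (x @ self.down[e] @ self.up[e])
        return out, gates


def domain_expert_mutual_information(joint):
    """Mutual information I(D; E) of a (num_domains, num_experts) table of
    non-negative assignment weights. Higher values mean that knowing the
    domain tells you more about which expert is used."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    p_d = joint.sum(axis=1, keepdims=True)  # marginal over domains
    p_e = joint.sum(axis=0, keepdims=True)  # marginal over experts
    mask = joint > 0
    return float((joint[mask] * np.log(joint[mask] / (p_d @ p_e)[mask])).sum())
```

Under these assumptions, a loss that maximizes domain-expert dependency would subtract a scaled version of this mutual information from the adaptation objective, and only the routers and low-rank expert matrices would need to be trainable, which is consistent with the small trainable-parameter count the abstract reports.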
