Learning from Many and Adapting to the Unknown in Open-set Test Streams

Xiao Zhang, Juntao Lyu, Tianyu Hu, Qianchuan Zhao, Huimin Ma

Abstract

Large Language Models (LLMs) generalize across tasks via reusable representations and flexible reasoning, yet remain brittle in real deployment under evolving tasks and continual distribution shift. A common remedy is Test-Time Adaptation (TTA), but existing methods update models with hand-designed unsupervised objectives over the full parameter space and largely overlook preserving shared source knowledge and the reliability of adaptation signals. Drawing on the molecular signaling cascades of memory updating in Drosophila, we propose Synapse Consolidation (SyCo), a parameter-efficient LLM adaptation method that updates low-rank adapters through Rac1 and MAPK pathways under the guidance of a structured TTA objective driven by problem understanding, process understanding, and a source-domain guardrail. Rac1 confines plasticity to a tail-gradient subspace that is less critical for source knowledge, enabling rapid specialization while preserving source representations. MAPK uses a tiered controller to suppress noisy updates and consolidate useful adaptations under non-stationary streams. To model real deployments with multiple sources and continually emerging tasks, we introduce the Multi-source Open-set Adaptation (MOA) setting, in which a model is trained on multiple labeled source tasks and then adapts on open, non-stationary unlabeled test streams that mix seen and unseen tasks with partial overlap in label and intent space. Across 18 NLP datasets under the MOA setting, SyCo consistently outperforms strong baselines, achieving 78.31% on unseen-task adaptation and 85.37% on unseen-data shifts.
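The abstract describes the Rac1 pathway only at a high level. As a rough illustration of what confining plasticity to a tail-gradient subspace can look like for LoRA parameters, the sketch below masks out the coordinates with the largest accumulated source-task gradient magnitudes and applies test-time updates only to the remaining "tail" coordinates. The function names, the ranking criterion, and the `tail_fraction` parameter are illustrative assumptions, not the paper's implementation.

```python
import torch

def select_tail_subspace(source_grads: torch.Tensor, tail_fraction: float = 0.3) -> torch.Tensor:
    """Return a boolean mask over LoRA parameters marking a tail-gradient
    subspace: coordinates whose accumulated source-task gradient magnitude is
    smallest, i.e. directions presumed least critical for source knowledge.

    source_grads:  per-parameter gradient magnitudes accumulated on the
                   labeled source tasks (same shape as the LoRA parameters).
    tail_fraction: fraction of coordinates opened for test-time plasticity.
    """
    k = max(1, int(tail_fraction * source_grads.numel()))
    # Indices of the k smallest accumulated gradient magnitudes.
    tail_idx = torch.topk(source_grads.flatten(), k, largest=False).indices
    mask = torch.zeros_like(source_grads.flatten(), dtype=torch.bool)
    mask[tail_idx] = True
    return mask.view_as(source_grads)

def masked_sgd_step(param: torch.Tensor, grad: torch.Tensor,
                    mask: torch.Tensor, lr: float) -> None:
    """Apply an SGD step only inside the selected subspace, leaving
    source-critical coordinates untouched."""
    with torch.no_grad():
        param -= lr * grad * mask
```

In this reading, gradients from the unsupervised TTA objective would be projected onto the mask before every step, which is what allows specialization to the test stream without disturbing the source-critical directions.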

Paper Structure

This paper contains 27 sections, 1 theorem, 21 equations, 4 figures, 8 tables, and 1 algorithm.

Key Result

Theorem 1

Consider adaptation from task $\mathcal{T}^{(m-1)}$ to task $\mathcal{T}^{(m)}$. Let $\mathcal{L}_m(\boldsymbol{\theta})$ be the task loss and assume that $\mathcal{L}_m$ is $\beta$-smooth. Initialize adaptation at $\boldsymbol{\theta}_m^{(0)}=\boldsymbol{\theta}_{m-1}^*$ and restrict updates to the Rac1-selected LoRA subspace, with $\rho\in[0,1)$. Choosing a constant step size $\eta=1/\beta$, the Rac1 update iterates satisfy projected stationarity within this subspace.
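The statement above is cut off in this extract. Under the stated $\beta$-smoothness and the step size $\eta=1/\beta$, one standard way to read the result is as projected gradient descent restricted to the Rac1-selected subspace; the display below sketches that update and the descent bound it satisfies, where $P$ is our notation for the orthogonal projector onto the selected LoRA subspace (an assumption, not necessarily the paper's notation).

\[
\boldsymbol{\theta}_m^{(t+1)} \;=\; \boldsymbol{\theta}_m^{(t)} \;-\; \eta\, P\,\nabla \mathcal{L}_m\bigl(\boldsymbol{\theta}_m^{(t)}\bigr), \qquad \eta = \tfrac{1}{\beta}.
\]

By $\beta$-smoothness and $P^2 = P = P^\top$, each step satisfies

\[
\mathcal{L}_m\bigl(\boldsymbol{\theta}_m^{(t+1)}\bigr) \;\le\; \mathcal{L}_m\bigl(\boldsymbol{\theta}_m^{(t)}\bigr) \;-\; \frac{1}{2\beta}\,\bigl\|P\,\nabla \mathcal{L}_m\bigl(\boldsymbol{\theta}_m^{(t)}\bigr)\bigr\|^2,
\]

so, provided $\mathcal{L}_m$ is bounded below, the projected gradient norm $\|P\,\nabla\mathcal{L}_m\|$ is driven toward zero along the iterates, i.e. the updates approach a point that is stationary within the selected subspace.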

Figures (4)

  • Figure 1: An analogy between the Rac1–MAPK signaling sequence in biological systems and our biomimetic TTA mechanism. In biology, brain-processed stimuli engage Rac1-associated synaptic remodeling, followed by MAPK signaling that regulates downstream cellular responses. Likewise, in our framework, source-task knowledge induces dedicated subspace activation, which modulates learning signals via an adaptive learning-rate regulation module for controlled adaptation toward the target task space.
  • Figure 2: Comparison of different adaptation paradigms. (a) TTA adapts a pretrained model on unlabeled test streams with distribution shift. (b) MTL jointly trains a model on multiple source tasks to learn shared representations. (c) MOA combines MTL and TTA, training on multiple labeled source tasks and adapting on open-set unlabeled test streams with mixed seen and unseen tasks.
  • Figure 3: Overview of SyCo. The Rac1 pathway selectively resets and re-parameterizes specific components of the gradient subspace to open effective signal channels for unseen tasks. The grey blocks following Rac1 activation represent the designated directions for gradient updates. The MAPK pathway defines tiered activation events based on positive shifts across three consistency metrics: decreasing entropy, increasing likelihood, and increasing consistency (a minimal sketch of such a tiered gate follows this figure list).
  • Figure 4: Scaling curves of Qwen3 models with and without TA-LoRA under the MOA setting. (a) Source performance as a function of model size. (b) Target performance as a function of model size.
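Figure 3 characterizes the MAPK pathway through positive shifts in three unsupervised signals. A minimal sketch of one way such a tiered gate could be realized is given below; the `AdaptationSignals` container, the tier-to-scale mapping, and the function names are illustrative assumptions rather than the paper's controller.

```python
from dataclasses import dataclass

@dataclass
class AdaptationSignals:
    """Per-batch unsupervised signals tracked across the test stream."""
    entropy: float      # predictive entropy (lower is better)
    likelihood: float   # average log-likelihood of predictions (higher is better)
    consistency: float  # agreement across augmented/paraphrased views (higher is better)

def mapk_tier(prev: AdaptationSignals, curr: AdaptationSignals) -> int:
    """Count positive shifts: entropy down, likelihood up, consistency up."""
    return sum([
        curr.entropy < prev.entropy,
        curr.likelihood > prev.likelihood,
        curr.consistency > prev.consistency,
    ])

def gated_learning_rate(base_lr: float, tier: int) -> float:
    """Tiered gating (illustrative): suppress the update when no signal
    improves, and scale it up as more signals agree."""
    scale = {0: 0.0, 1: 0.25, 2: 0.5, 3: 1.0}[tier]
    return base_lr * scale
```

In this sketch a tier of 0 (no signal improves) suppresses the step entirely, which mirrors the stated goal of discarding unreliable adaptation signals on noisy, non-stationary streams.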

Theorems & Definitions (3)

  • Theorem 1: Projected stationarity under Rac1
  • Remark 1: Justification of the smoothness assumption used in Theorem 1
  • Proof of Theorem 1