Automatic Domain Adaptation by Transformers in In-Context Learning

Ryuichiro Hataya; Kota Matsui; Masaaki Imaizumi

TL;DR

The paper addresses the challenge of selecting and applying domain adaptation methods under covariate shift without retraining models at test time. It shows that Transformers in an in-context learning setup can approximate both instance-based (importance-weighted learning, IWL, with uLSIF density-ratio estimation) and feature-based (DANN) unsupervised domain adaptation (UDA) algorithms, and can automatically choose between them based on properties of the dataset. The theoretical results are constructive: they exhibit Transformer architectures that implement the required linear and minimax updates with bounded error. Experiments on two-moon and colorized-MNIST tasks show gains over traditional UDA baselines. The work suggests that foundation models, via in-context learning, can serve as adaptive, cross-framework domain adapters, potentially reducing the manual effort of selecting and tuning transfer techniques in real-world applications.

Abstract

Selecting or designing an appropriate domain adaptation algorithm for a given problem remains challenging. This paper presents a Transformer model that can provably approximate and select domain adaptation methods for a given dataset in the in-context learning framework, where a foundation model performs new tasks without updating its parameters at test time. Specifically, we prove that Transformers can approximate instance-based and feature-based unsupervised domain adaptation algorithms and automatically select an algorithm suited to a given dataset. Numerical results indicate that in-context learning performs adaptive domain adaptation that surpasses existing methods.
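To make the instance-based route concrete, the following is a minimal NumPy sketch of the classical pipeline that the Transformer is said to approximate: uLSIF density-ratio estimation followed by importance-weighted learning. It is not the paper's construction. The Gaussian feature map, the function names (`gaussian_features`, `ulsif_weights`, `weighted_ridge`), the hyperparameters, and the synthetic 1D data are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_features(x, centers, sigma):
    """Gaussian kernel features phi(x), shape (n_samples, n_centers)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def ulsif_weights(x_src, x_tgt, centers, sigma=1.0, lam=1e-2):
    """Estimate r(x) = p_T(x) / p_S(x) at the source points with uLSIF.

    Solves the regularized least-squares problem in closed form:
    alpha = (H + lam * I)^{-1} h, with H the source second moment of phi
    and h the target mean of phi. Negative estimates are clipped to zero.
    """
    phi_s = gaussian_features(x_src, centers, sigma)
    phi_t = gaussian_features(x_tgt, centers, sigma)
    H = phi_s.T @ phi_s / len(x_src)
    h = phi_t.mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return np.maximum(phi_s @ alpha, 0.0)


def weighted_ridge(x, y, w, reg=1e-3):
    """Importance-weighted ridge regression for a linear model with bias."""
    X = np.hstack([x, np.ones((len(x), 1))])
    A = X.T @ (w[:, None] * X) / len(x) + reg * np.eye(X.shape[1])
    b = X.T @ (w * y) / len(x)
    return np.linalg.solve(A, b)


# Synthetic covariate shift (illustrative only): the target inputs are
# shifted and narrower than the source inputs; labels exist only for source.
rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, size=(200, 1))
x_tgt = rng.normal(1.0, 0.5, size=(200, 1))
y_src = np.sin(x_src[:, 0]) + 0.1 * rng.normal(size=200)

centers = x_tgt[:50]  # kernel centers taken from target samples
weights = ulsif_weights(x_src, x_tgt, centers)
theta = weighted_ridge(x_src, y_src, weights)  # [slope, bias]
```

Source points that fall where the target distribution has more mass receive larger weights, so the fitted model prioritizes accuracy in the target region. In the in-context setting described above, the same input-to-prediction map would be computed inside a single forward pass rather than by an explicit solver.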

Paper Structure (18 sections, 6 theorems, 18 equations, 1 figure)

Introduction
Preliminary
Unsupervised Domain Adaptation
In-context Learning
Approximating UDA Algorithms by Transformers
Setup and Notation
In-context IWL with uLSIF
In-context DANN
Automatic Algorithm Selection by In-Context UDA
Experiments
Conclusion and Discussion
Proofs
In-context IWL with uLSIF
In-context DANN
In-context UDA Algorithm Selection
...and 3 more sections

Key Result

Theorem 1

Consider $\widehat{f}^{\mathrm{IWL}}$ with ${\bm \phi}$ which is approximable by a sum of ReLUs. Fix $\varepsilon > 0$ arbitrarily and set $L_2$ as satisfying $0<\varepsilon\leq B_w/2L_2$. Suppose that an input $(\mathcal{D}_S, \mathcal{D}_T, {\bm{x}}_*)$ satisfies that ...

Figures (1)

Figure 1: (Left) Test accuracy averaged over five runs of Transformer (ICL) and baseline models on the two-moon 2D problem. Decision boundaries of the models are presented when $N=200$. (Right) Test accuracy averaged over five runs on the colorized MNIST.

Theorems & Definitions (11)

Definition 1: $(\varepsilon, R, M, C)$-approximability by sum of ReLUs (Bai et al., 2023)
Theorem 1
Theorem 2
Corollary 1
Theorem 3
Proof of Theorem 1
Proposition 1
Proof of Proposition 1
Lemma 9
Proof of Lemma 9
...and 1 more
