Feed-Forward Latent Domain Adaptation

Ondrej Bohdal; Da Li; Shell Xu Hu; Timothy Hospedales

TL;DR

This work introduces Feed-Forward Latent Domain Adaptation (CXDA), a practical framework for adapting a pre-trained model to deployment data drawn from multiple latent domains, without access to source data and without back-propagation. CXDA meta-learns a cross-attention module that jointly processes the query and a set of unlabeled support examples, so that it can selectively draw on relevant instances and adapt inference per example in a streaming, feed-forward manner. Experiments on FEMNIST, CIFAR-C, TinyImageNet-C, and iWildCam show that CXDA consistently outperforms strong ERM baselines and many back-propagation methods, and that it remains robust to domain mixture under real-time constraints. The results suggest that automated instance selection via cross-attention can sometimes outperform adaptation that relies on manual domain labels, offering practical benefits for edge devices facing real-world domain shifts.

Abstract

We study a new, highly practical problem setting that enables resource-constrained edge devices to adapt a pre-trained model to their local data distributions. Recognizing that a device's data are likely to come from multiple latent domains that include a mixture of unlabelled domain-relevant and domain-irrelevant examples, we focus on the comparatively under-studied problem of latent domain adaptation. Considering the limitations of edge devices, we aim to use only a pre-trained model and adapt it in a feed-forward way, without using back-propagation and without access to the source data. Modelling these realistic constraints brings us to the novel and practically important problem setting of feed-forward latent domain adaptation. Our solution is to meta-learn a network capable of embedding the mixed-relevance target dataset and dynamically adapting inference for target examples using cross-attention. The resulting framework leads to consistent improvements over strong ERM baselines. We also show that our framework sometimes even improves on the upper bound of domain-supervised adaptation, where only domain-relevant instances are provided for adaptation. This suggests that human-annotated domain labels may not always be optimal, and raises the possibility of doing better through automated instance selection.
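To make the idea of adapting inference with cross-attention more concrete, the following is a minimal sketch of a single feed-forward adaptation step, not the authors' implementation. It assumes that the query and the support examples are already embedded as feature vectors, it uses identity projections in place of the meta-learned query, key and value projections, and it adds the attended context back to the query with a residual connection. The function names, the scaled dot-product scoring and the toy vectors are all illustrative assumptions. Only the Python standard library is used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_adapt(query, support):
    """Adapt one query feature using unlabeled support features.

    Assumption: identity projections stand in for the learned
    query/key/value projections a meta-trained module would have.
    Returns the adapted query feature and the attention weights.
    """
    scale = math.sqrt(len(query))
    # Score every support example against the query (scaled dot product).
    scores = [dot(query, s) / scale for s in support]
    weights = softmax(scores)
    # Weighted average of support features: relevant examples dominate.
    context = [
        sum(w * s[i] for w, s in zip(weights, support))
        for i in range(len(query))
    ]
    # Residual connection (an assumption): keep the original query signal.
    adapted = [q + c for q, c in zip(query, context)]
    return adapted, weights


# Toy example: two support features resemble the query, one does not.
query = [1.0, 0.0]
support = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
adapted, weights = cross_attention_adapt(query, support)
print(weights)
```

In this toy run the two support vectors that resemble the query receive larger weights than the dissimilar one, which mirrors the intuition that domain-relevant examples should contribute more to the adapted feature. No gradients are computed at any point; adaptation is a single forward pass.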

Paper Structure (23 sections, 5 equations, 5 figures, 8 tables, 2 algorithms)

Introduction
Background and related work
Methods
Set-up
Objective
Architecture
Meta-learning
Experiments
Benchmarks
Baselines
Implementation details
Results
Further analysis
Discussion
Conclusion
...and 8 more sections

Figures (5)

Figure 1: Illustration of standard and latent domain adaptation (LDA) settings. In the LDA setting, support images come from a variety of domains of mixed and unknown relevance to the test (query) image. In standard DA, adaptation images are all assumed to be equally relevant.
Figure 2: Illustration of the desired application scenario where a pre-trained model is deployed to many edge devices. Each device utilizes its own data coming from several domains to quickly adapt the model for the current test image.
Figure 3: Analysis of test accuracy (%) vs time per task (ms) for the various approaches evaluated. CXDA achieves the best performance, has similar speed to other feed-forward baselines and is faster than fine-tuning approaches that use back-propagation (1 and 10 adaptation steps are shown for FT-EM and FT-IM). The difference is especially large when the fine-tuning approaches use 10 fine-tuning steps, but even if only 1 step is used there is a visible speed difference. Time per task includes adapting to the task and making a prediction.
Figure 4: Density histograms of attention weights for pairs of same and different domain examples in the test tasks of iWildCam.
Figure 5: Analysis of attention weights for an example task in iWildCam, with a query image coming from location (camera trap) #288. We show the five support examples in each domain that have the largest and smallest attention weights. Similar images from the same location (#288) are given the largest weights, but also relevant images from other locations (e.g. #125) are given larger weights. The examples with the smallest attention weights visually do not seem relevant.
