Domain Adaptations for Computer Vision Applications

Oscar Beijbom

TL;DR

The paper surveys domain adaptation methods for computer vision, focusing on how labeled data from a source domain can be leveraged to improve target domain predictions under distribution shifts. It categorizes approaches into instance weighting, source priors, common representations, transfer learning, and multi‑modal learning, detailing theory, algorithms (e.g., kernel mean matching, MMD, ITML), and CV applications. Key contributions include clarifying relaxations such as covariate shift and class imbalance, and presenting unified frameworks like Generalized Multiview Analysis for cross‑modal data. The work highlights practical strategies for CV tasks where labeling is expensive or data distributions change over time or modalities.

Abstract

A basic assumption of statistical learning theory is that training and test data are drawn from the same underlying distribution. Unfortunately, this assumption doesn't hold in many applications. Instead, ample labeled data might exist in a particular 'source' domain while inference is needed in another, 'target' domain. Domain adaptation methods leverage labeled data from both domains to improve classification on unseen data in the target domain. In this work we survey domain transfer learning methods for various application domains, with a focus on recent work in Computer Vision.

Paper Structure

This paper contains 17 sections, 30 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The MEGA model proposed in hal_domain. This model assumes the data is in fact generated by three distributions: a target, a common, and a source distribution. The MEGA model learns a classifier for each space. The left panel shows the standard logistic regression model.
  • Figure 2: Figure from Douglas_A._Speaker illustrating the adapted GMM model. The left figure shows the universal GMM estimated from the background data together with the speaker-specific training data. The right figure shows the adapted model.
  • Figure 3: Figure from raghuraman_domain illustrating the proposed method.