One-Shot Domain Incremental Learning

Yasushi Esaki, Satoshi Koide, Takuro Kutsuna

TL;DR

This work defines one-shot domain incremental learning (DIL), where a pre-trained model must adapt to a new domain from a single example while preserving performance on the original domain. It shows that standard DIL methods such as EWC and GEM fail in this setting and identifies drifting batch normalization (BN) statistics as the root cause. The authors propose a BN-centric fix: freeze the moving averages learned from the original data and train with these fixed statistics, using data augmentation to generate multiple pseudo-samples from the single new-domain example and a memory buffer to maintain original-domain performance. Experiments on MNIST, CIFAR10, and RESISC45 show that fixing the BN statistics significantly improves new-domain accuracy with limited or no degradation on the original domain, establishing a practical baseline for one-shot DIL that can be combined with existing DIL methods.
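
The data side of the summary above (pseudo-samples from one example, plus replay from a memory buffer) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `augment`, `make_batch`, the flip-and-shift augmentation, and the batch sizes are all hypothetical choices, and images are plain nested lists to keep the sketch dependency-free.

```python
import random

def augment(image, rng):
    """Return a randomly flipped and shifted copy of a 2D image (list of rows).

    The choice of augmentations is an assumption made for this sketch.
    """
    rows = [row[:] for row in image]
    if rng.random() < 0.5:
        rows = [row[::-1] for row in rows]  # horizontal flip
    shift = rng.randint(-1, 1)              # horizontal shift of at most one pixel
    if shift > 0:
        rows = [[0] * shift + row[:-shift] for row in rows]
    elif shift < 0:
        rows = [row[-shift:] + [0] * (-shift) for row in rows]
    return rows

def make_batch(new_sample, memory_buffer, n_pseudo, n_replay, rng):
    """Mix pseudo-samples of the single new-domain example with replayed originals."""
    image, label = new_sample
    pseudo = [(augment(image, rng), label) for _ in range(n_pseudo)]
    replay = rng.sample(memory_buffer, n_replay)  # original-domain samples
    return pseudo + replay

rng = random.Random(0)
new_sample = ([[1, 2, 3], [4, 5, 6]], 1)  # toy 2x3 "image" with class label 1
memory_buffer = [([[0, 0, 0], [0, 0, 0]], c) for c in range(10)]
batch = make_batch(new_sample, memory_buffer, n_pseudo=4, n_replay=4, rng=rng)
```

Each such batch would then be passed through the network while the BN moving averages stay fixed at their pre-trained values, which is the part of the method the summary emphasizes.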

Abstract

Domain incremental learning (DIL) has been discussed in previous studies on deep neural network models for classification. In DIL, we assume that samples on new domains are observed over time. The models must classify inputs on all domains. In practice, however, we may encounter a situation where we need to perform DIL under the constraint that the samples on the new domain are observed only infrequently. Therefore, in this study, we consider the extreme case where we have only one sample from the new domain, which we call one-shot DIL. We first empirically show that existing DIL methods do not work well in one-shot DIL. We have analyzed the reason for this failure through various investigations. According to our analysis, we clarify that the difficulty of one-shot DIL is caused by the statistics in the batch normalization layers. Therefore, we propose a technique regarding these statistics and demonstrate the effectiveness of our technique through experiments on open datasets. The code is available at https://github.com/ToyotaCRDL/OneShotDIL.

Paper Structure (20 sections, 4 figures, 5 tables, 1 algorithm)

Figures (4)

  • Figure 1: Example of the original domain and the new domain in one-shot domain incremental learning (one-shot DIL) with CIFAR10. In this example, trucks are added to the "automobile" class as the new domain. However, only one sample is added. The procedure for setting up datasets in one-shot DIL is described in the dataset section.
  • Figure 2: Transition of the moving averages of the mean at the batch normalization layer closest to the input layer in ResNet18. During training, the moving averages of the statistics in the batch normalization layer are updated at every forward propagation. Therefore, we recorded the moving averages of the statistics at every forward propagation and plotted the resulting sequences. Since the batch normalization layer normalizes each channel independently, we plotted the average over the channels.
  • Figure 3: Transition of the moving averages of the variance at the batch normalization layer closest to the input layer in ResNet18. The settings of the plot are the same as those shown in Fig. 2.
  • Figure : Our one-shot DIL algorithm with improved batch normalization.
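
The quantity plotted in Figures 2 and 3 can be illustrated with a small sketch. It assumes the exponential-moving-average update commonly used in batch normalization implementations, with a momentum of 0.1; the per-channel values, the number of steps, and the function names are hypothetical and chosen only to make the drift visible. It is not a reproduction of the paper's experiment.

```python
def update_running_mean(running, batch_mean, momentum=0.1):
    """One moving-average update per channel, as done at each forward pass."""
    return [(1 - momentum) * r + momentum * b for r, b in zip(running, batch_mean)]

def trace_channel_average(running, batch_means, frozen, momentum=0.1):
    """Record the channel-averaged moving mean after every forward pass."""
    trace = []
    for batch_mean in batch_means:
        if not frozen:  # frozen=True keeps the pre-trained statistics untouched
            running = update_running_mean(running, batch_mean, momentum)
        trace.append(sum(running) / len(running))
    return trace

# Hypothetical per-channel moving averages after training on the original domain.
original_stats = [0.0, 0.2, -0.2]
# Hypothetical batch means once the batches are dominated by the new-domain sample.
shifted_batches = [[1.0, 1.2, 0.8]] * 50

drifting = trace_channel_average(original_stats, shifted_batches, frozen=False)
fixed = trace_channel_average(original_stats, shifted_batches, frozen=True)
```

In this toy setting, `drifting` moves steadily away from the original channel average toward the shifted batch statistics, while `fixed` stays at its starting value throughout, which is the behavior the frozen-statistics approach relies on.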