Less-forgetting Learning in Deep Neural Networks
Heechul Jung, Jeongwoo Ju, Minju Jung, Junmo Kim
TL;DR
Catastrophic forgetting in DNNs during domain adaptation is a major challenge, especially when source data are unavailable during target learning. The authors propose Less-forgetting Learning (LF), which freezes the classifier boundary and minimizes a joint loss combining cross-entropy on target data with a Euclidean penalty that aligns hidden features with those of the source network. They further extend the approach to general SGD training by addressing forgetting between mini-batches and introducing an alternating training scheme. Empirical results on CIFAR-10, MNIST, and SVHN show that LF improves retention of source-domain information and enhances generalization compared with traditional transfer learning and activation-function-based methods.
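The joint loss described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `less_forgetting_loss`, the weights `lambda_c` and `lambda_e`, the factor 0.5, and the batch averaging are all assumptions made for illustration. The classifier parameters `W` and `b` stand in for the frozen classifier boundary; in a real training loop only the feature extractor producing `feat_new` would be updated.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def less_forgetting_loss(feat_new, feat_old, W, b, labels,
                         lambda_c=1.0, lambda_e=1.0):
    """Hypothetical joint loss: cross-entropy + Euclidean feature penalty.

    feat_new: (N, D) hidden features of the network being adapted (target data).
    feat_old: (N, D) features of the fixed source network for the same inputs.
    W, b:     frozen classifier weights (D, K) and bias (K,); treated as constants.
    labels:   (N,) integer class labels of the target data.
    lambda_c, lambda_e: assumed trade-off weights, not values from the paper.
    """
    n = labels.shape[0]
    probs = softmax(feat_new @ W + b)
    # Mean cross-entropy on the target mini-batch.
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    # Mean squared Euclidean distance between new and source features.
    eu = 0.5 * np.sum((feat_new - feat_old) ** 2, axis=1).mean()
    return lambda_c * ce + lambda_e * eu, ce, eu

# Usage with toy values: 2 samples, 3-dimensional features, 4 classes.
feat_old = np.zeros((2, 3))
feat_new = np.ones((2, 3))
W = np.zeros((3, 4))
b = np.zeros(4)
labels = np.array([0, 3])
total, ce, eu = less_forgetting_loss(feat_new, feat_old, W, b, labels)
```

When `feat_new` equals `feat_old` the penalty term vanishes and the loss reduces to plain cross-entropy, so the penalty is what discourages the adapted features from drifting away from those of the source network.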
Abstract
The catastrophic forgetting problem causes deep neural networks to forget previously learned information when they learn from data collected in new environments, for example with different sensors or under different lighting conditions. This paper presents a new method for alleviating the catastrophic forgetting problem. Unlike previous research, our method does not use any information from the source domain. Surprisingly, it is nevertheless very effective at forgetting less of the source-domain information, and we demonstrate its effectiveness in several experiments. Furthermore, we observed that the forgetting problem also occurs between mini-batches during general training with stochastic gradient descent methods, and that this is one of the factors that degrade the generalization performance of the network. We also try to solve this problem using the proposed method. Finally, we show that our less-forgetting learning method also helps improve the performance of deep neural networks in terms of recognition rates.
