Enhancing Robustness in Incremental Learning with Adversarial Training
Seungju Cho, Hongsin Lee, Changick Kim
TL;DR
This paper tackles Adversarially Robust Class Incremental Learning (ARCIL), the problem of keeping a model adversarially robust while classes arrive sequentially, by introducing FLAIR. FLAIR combines separated logits, flatness-preserving distillation, and data augmentation to mitigate the conflict between acquiring new knowledge and preserving past robustness. The method uses an adversarial distillation objective with separated logits (ADSL), preserves gradient and Hessian information via flatness-preserving distillation (FPD), and leverages augmentation to address data scarcity. These components are formalized in the loss $l_{FLAIR}(x,y)= l_{BCE}([f_t(x_{adv})]^t_{t-1},y) + \alpha \cdot l_{BCE}([f_t(x_{adv})]^{t-1}_{0}, f_{t-1}(x_{adv})) + \beta \cdot l_{FPD}(x, x_{adv}; f_t, f_{t-1})$, where $\alpha$ and $\beta$ weight the two distillation terms. Experiments on S-CIFAR10/100 and S-SVHN (with and without memory buffers) show that FLAIR achieves the best clean and robust accuracy and the highest robust backward transfer, significantly outperforming AT, adversarial distillation baselines, and existing ARCIL methods. The work advances practical robustness in real-world continual learning by addressing three ARCIL-specific challenges and releasing its code.
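To make the structure of the loss above concrete, here is a minimal sketch in plain Python that evaluates the three terms for a single example from precomputed logits. It is an illustration under stated assumptions, not the authors' implementation. The function names (`bce_with_logits`, `flair_loss`) and the argument `n_old` (number of classes seen before the current task) are hypothetical. The sketch also assumes that the teacher's logits are turned into soft targets with a sigmoid, and that the FPD term compares how the old-class outputs change between the clean and adversarial input for $f_t$ and $f_{t-1}$. The summary gives only the signature $l_{FPD}(x, x_{adv}; f_t, f_{t-1})$, so that particular form is a guess.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy between sigmoid(logits) and targets in [0, 1]."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Numerically stable form of -[t*log(s) + (1-t)*log(1-s)], s = sigmoid(z)
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def flair_loss(student_adv, student_clean, teacher_adv, teacher_clean,
               label, n_old, alpha=1.0, beta=1.0):
    """Sketch of l_FLAIR for one example (all inputs are lists of logits).

    student_* : outputs of the current model f_t over old + new classes
    teacher_* : outputs of the previous model f_{t-1} over the n_old old classes
    label     : index of the true class, assumed to be one of the new classes
    """
    # Separated logits: new-class and old-class logits are handled apart.
    old_logits = student_adv[:n_old]
    new_logits = student_adv[n_old:]

    # Term 1: BCE on the new-class logits against the one-hot label.
    one_hot = [1.0 if n_old + i == label else 0.0 for i in range(len(new_logits))]
    l_new = bce_with_logits(new_logits, one_hot)

    # Term 2: BCE distillation on old-class logits (ASSUMPTION: sigmoid soft targets).
    soft_targets = [sigmoid(z) for z in teacher_adv]
    l_distill = bce_with_logits(old_logits, soft_targets)

    # Term 3 (ASSUMED form of FPD): match the clean-to-adversarial output change
    # of the student to that of the teacher on the old classes.
    diffs = [
        ((sa - sc) - (ta - tc)) ** 2
        for sa, sc, ta, tc in zip(old_logits, student_clean[:n_old],
                                  teacher_adv, teacher_clean)
    ]
    l_fpd = sum(diffs) / len(diffs)

    return l_new + alpha * l_distill + beta * l_fpd


# Example: 2 old classes, 2 new classes, true label is the first new class.
loss = flair_loss(
    student_adv=[1.0, -1.0, 2.0, 0.5],
    student_clean=[0.5, -0.5, 2.5, 0.0],
    teacher_adv=[1.0, -1.0],
    teacher_clean=[0.5, -0.5],
    label=2,
    n_old=2,
)
```

In this example the student's old-class outputs shift between the clean and adversarial input exactly as the teacher's do, so the assumed FPD term is zero and only the two BCE terms contribute. In an actual training loop the logits would come from forward passes on `x` and `x_adv`, and the loss would be differentiated with an autodiff framework.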
Abstract
Adversarial training is one of the most effective defenses against adversarial attacks. However, it has primarily been studied in settings where data for all classes is available at once, with limited research on incremental learning, where knowledge is introduced sequentially. In this study, we investigate Adversarially Robust Class Incremental Learning (ARCIL), which addresses adversarial robustness in incremental learning. We first explore a series of baselines that integrate incremental learning with existing adversarial training methods and find that they lead to conflicts between acquiring new knowledge and retaining past knowledge. Furthermore, we discover that learning new knowledge causes the disappearance of a key characteristic of robust models: a flat loss landscape in input space. To address these issues, we propose a novel and robust baseline for ARCIL, named \textbf{FL}atness-preserving \textbf{A}dversarial \textbf{I}ncremental learning for \textbf{R}obustness (\textbf{FLAIR}). Experimental results demonstrate that FLAIR significantly outperforms other baselines. To the best of our knowledge, we are the first to comprehensively investigate the baselines, challenges, and solutions for ARCIL, which we believe represents a significant advance toward achieving real-world robustness. Code is available at \url{https://github.com/HongsinLee/FLAIR}.
