Selective Attention-based Modulation for Continual Learning
Giovanni Bellitto, Federica Proietto Salanitri, Matteo Pennisi, Matteo Boschini, Angelo Porrello, Simone Calderara, Simone Palazzo, Concetto Spampinato
TL;DR
The paper tackles catastrophic forgetting in online continual learning by introducing SAM, a biologically inspired selective attention mechanism that modulates a classification network with features from a saliency-prediction branch. It uses a two-branch architecture with a shared-alignment saliency encoder and multiplicative feature modulation, optimized with a combined loss $\mathcal{L}=\mathcal{L}_s + \lambda\mathcal{L}_c$ (saliency loss plus weighted classification loss), where gradients of the classification loss are stopped before they reach the saliency encoder. Experiments on Split Mini-ImageNet and Split FG-ImageNet show that SAM consistently improves the performance of state-of-the-art online CL methods (by up to ~20 percentage points) and increases robustness to spurious features and adversarial perturbations. The results support the neuro-inspired view that attention mechanisms can be leveraged to preserve past knowledge while efficiently learning new tasks, and point to extensions to heterogeneous architectures and broader low-level vision tasks.
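The training setup summarized above can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the class `TwoBranchSketch`, the layer sizes, the mean-squared-error saliency loss, the single modulation point, and the value of `lam` are all assumptions made for illustration. Only the three ingredients named in the summary are taken from the text: multiplicative modulation of classification features by saliency features, the combined loss $\mathcal{L}=\mathcal{L}_s + \lambda\mathcal{L}_c$, and the stopped gradient from the classification loss to the saliency encoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class TwoBranchSketch(nn.Module):
    """Hypothetical two-branch model: a saliency branch and a classifier branch."""

    def __init__(self, num_classes=5, width=8):
        super().__init__()
        # Saliency branch (assumed shapes): encoder plus a 1x1 decoder to a map.
        self.sal_enc = nn.Conv2d(3, width, kernel_size=3, padding=1)
        self.sal_dec = nn.Conv2d(width, 1, kernel_size=1)
        # Classification branch (assumed shapes): encoder plus a linear head.
        self.cls_enc = nn.Conv2d(3, width, kernel_size=3, padding=1)
        self.head = nn.Linear(width, num_classes)

    def forward(self, x):
        sal_feats = F.relu(self.sal_enc(x))
        sal_map = torch.sigmoid(self.sal_dec(sal_feats))
        cls_feats = F.relu(self.cls_enc(x))
        # Multiplicative modulation. detach() stops the classification loss
        # from back-propagating into the saliency encoder.
        modulated = cls_feats * sal_feats.detach()
        logits = self.head(modulated.mean(dim=(2, 3)))  # global average pooling
        return sal_map, logits


def combined_loss(sal_map, sal_target, logits, labels, lam):
    # L = L_s + lambda * L_c; MSE for L_s is a placeholder choice.
    loss_s = F.mse_loss(sal_map, sal_target)
    loss_c = F.cross_entropy(logits, labels)
    return loss_s + lam * loss_c, loss_s, loss_c


model = TwoBranchSketch()
x = torch.randn(4, 3, 16, 16)           # random stand-in images
sal_target = torch.rand(4, 1, 16, 16)   # random stand-in saliency maps
labels = torch.randint(0, 5, (4,))

sal_map, logits = model(x)
total, loss_s, loss_c = combined_loss(sal_map, sal_target, logits, labels, lam=0.5)

# The classification loss has no path to the saliency encoder weights.
grad_c = torch.autograd.grad(
    loss_c, model.sal_enc.weight, retain_graph=True, allow_unused=True
)[0]
# The saliency loss alone determines the saliency encoder gradient.
grad_s = torch.autograd.grad(loss_s, model.sal_enc.weight, retain_graph=True)[0]

total.backward()
print("classification loss reaches saliency encoder:", grad_c is not None)
print("saliency encoder gradient comes from L_s only:",
      torch.allclose(model.sal_enc.weight.grad, grad_s))
```

In this sketch the saliency encoder is updated only by $\mathcal{L}_s$, while the classification encoder and head are updated only by $\lambda\mathcal{L}_c$, so the saliency features act as a modulation signal that the classification objective cannot alter.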
Abstract
We present SAM, a biologically plausible selective attention-driven modulation approach to enhance classification models in a continual learning setting. Inspired by neurophysiological evidence that the primary visual cortex does not contribute to object manifold untangling for categorization and that primordial attention biases are still embedded in the modern brain, we propose to employ auxiliary saliency prediction features as a modulation signal to drive and stabilize the learning of a sequence of non-i.i.d. classification tasks. Experimental results confirm that SAM effectively enhances the performance (in some cases by up to about twenty percentage points) of state-of-the-art continual learning methods, in both class-incremental and task-incremental settings. Moreover, we show that attention-based modulation encourages the learning of features that are more robust to spurious features and to adversarial attacks than those learned by baseline methods. Code is available at: https://github.com/perceivelab/SAM.
