Feature-based Graph Attention Networks Improve Online Continual Learning
Adjovi Sim, Zhengkui Wang, Aik Beng Ng, Shalini De Mello, Simon See, Wonmin Byeon
TL;DR
The paper tackles online continual learning for image classification by introducing Feature-based Graph Attention Networks (FGAT), which transform images into multi-scale feature graphs and process them with GATv2 to capture contextual relations and task-specific dynamics. It combines channel-wise weighted global mean pooling with SPN normalization to improve pooling stability, and introduces rehearsal memory duplication to balance the representation of new and past tasks under a fixed memory budget. Empirical results on SVHN, CIFAR10, CIFAR100, and MiniImageNet show FGAT achieving higher average performance than state-of-the-art CNN- and GNN-based continual learning methods, with notable gains on CIFAR100 and MiniImageNet, and robust performance across online task-incremental settings. The approach offers a parameter-efficient alternative to large transformers, using multi-scale feature maps and attention to model complex image relationships while mitigating catastrophic forgetting in dynamic environments.
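To make the GATv2 step more concrete, the following is a minimal single-head sketch of GATv2-style attention in NumPy. It is an illustration of the general GATv2 scoring rule, not the authors' implementation: the function name `gatv2_layer`, the separate source and target weight matrices, the added self-loops, and the LeakyReLU slope of 0.2 are all assumptions made for this example.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # Slope value is an assumption for this sketch.
    return np.where(x > 0, x, slope * x)

def gatv2_layer(h, adj, W_src, W_dst, a):
    """Hypothetical single-head GATv2-style layer.

    h:     (N, F) node features
    adj:   (N, N) boolean, adj[i, j] is True if node j is a neighbour of node i
    W_src: (F, D) projection applied to neighbour (source) nodes
    W_dst: (F, D) projection applied to the attending (target) node
    a:     (D,)   attention vector
    """
    n = h.shape[0]
    # Self-loops guarantee every node attends to at least itself.
    adj = adj | np.eye(n, dtype=bool)
    hs = h @ W_src                      # (N, D)
    hd = h @ W_dst                      # (N, D)
    # GATv2 applies the nonlinearity before the attention vector:
    # e[i, j] = a^T LeakyReLU(W_dst h_i + W_src h_j)
    e = leaky_relu(hd[:, None, :] + hs[None, :, :]) @ a   # (N, N)
    # Mask non-neighbours, then take a numerically stable row-wise softmax.
    e = np.where(adj, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)
    # Each node aggregates its neighbours' projected features.
    return alpha @ hs, alpha
```

Each row of `alpha` sums to one over that node's neighbourhood, so the attention weights act as a learned, input-dependent averaging of neighbour features.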
Abstract
Online continual learning for image classification is crucial for models to adapt to new data while retaining knowledge of previously learned tasks. This capability is essential for real-world settings with dynamic environments and evolving data distributions. Traditional approaches predominantly employ Convolutional Neural Networks, which process images as grids and primarily capture local patterns rather than relational information. Although transformer architectures have improved the ability to capture relationships, these models often require significantly more resources. In this paper, we present a novel online continual learning framework based on Graph Attention Networks (GATs), which effectively capture contextual relationships and dynamically update the task-specific representation via learned attention weights. Our approach uses a pre-trained feature extractor to convert images into graphs built from hierarchical feature maps, which represent information at varying levels of granularity. These graphs are then processed by a GAT combined with an enhanced global pooling strategy to improve classification performance for continual learning. In addition, we propose a rehearsal memory duplication technique that improves the representation of previous tasks while staying within the memory budget. Comprehensive evaluations on benchmark datasets, including SVHN, CIFAR10, CIFAR100, and MiniImageNet, demonstrate the superiority of our method over state-of-the-art methods.
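As a rough illustration of turning feature maps into graphs and pooling them, the sketch below treats every spatial location of a feature map as a node and connects 4-neighbouring locations. The abstract does not specify the graph construction or the pooling formula, so everything here is an assumption: the names `feature_map_to_graph` and `weighted_global_mean_pool`, the 4-neighbour connectivity, building one graph per scale, and reading channel-wise weighted mean pooling as a per-channel weight applied to the node average.

```python
import numpy as np

def feature_map_to_graph(fmap):
    """Hypothetical conversion of one feature map into a grid graph.

    fmap: (C, H, W) feature map from a (pre-trained) extractor.
    Returns node features of shape (H*W, C) and a boolean adjacency
    matrix of shape (H*W, H*W) with 4-neighbour edges and self-loops.
    """
    C, H, W = fmap.shape
    # Node r*W + c holds the channel vector at spatial location (r, c).
    nodes = fmap.reshape(C, H * W).T
    idx = np.arange(H * W).reshape(H, W)
    adj = np.eye(H * W, dtype=bool)
    # Horizontal neighbours (both directions).
    adj[idx[:, :-1].ravel(), idx[:, 1:].ravel()] = True
    adj[idx[:, 1:].ravel(), idx[:, :-1].ravel()] = True
    # Vertical neighbours (both directions).
    adj[idx[:-1, :].ravel(), idx[1:, :].ravel()] = True
    adj[idx[1:, :].ravel(), idx[:-1, :].ravel()] = True
    return nodes, adj

def weighted_global_mean_pool(nodes, channel_weights):
    """Assumed form of channel-wise weighted mean pooling:
    average over nodes, then scale each channel by a (learnable) weight."""
    return channel_weights * nodes.mean(axis=0)

# Usage: one graph per scale, e.g. a fine and a coarse feature map.
maps = [np.random.rand(8, 4, 4), np.random.rand(16, 2, 2)]
graphs = [feature_map_to_graph(f) for f in maps]
pooled = [weighted_global_mean_pool(n, np.ones(n.shape[1])) for n, _ in graphs]
```

Building a separate graph per scale sidesteps the question of how nodes with different channel counts would be linked across scales, which this summary does not describe.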
