Constructing Enhanced Mutual Information for Online Class-Incremental Learning

Huan Zhang, Fan Lyu, Shenghua Fan, Yujin Zheng, Dingwen Wang

TL;DR

The paper tackles Online Class-Incremental Learning (OCIL) by introducing Enhanced Mutual Information (EMI), a framework that decouples knowledge into Diversity, Representativeness, and Separability. Built on DualNet (a Slow Net for generalized features and a Fast Net for class-specific features), EMI constructs three MI terms (DMI, RMI, and SMI) to promote diversified, representative, and separable knowledge across a single-pass data stream with replay. By combining slow-feature diversification with prototype-based representativeness and separability constraints, EMI achieves more uniform intra-class distributions and clearer inter-class boundaries, yielding state-of-the-art performance on CIFAR-10, CIFAR-100, and Tiny-ImageNet in OCIL settings. The results show an improved stability-plasticity balance; ablations confirm the contribution of each component, and further analyses highlight practical gains along with some speed considerations for online deployment.

Abstract

Online Class-Incremental Continual Learning (OCIL) addresses the challenge of continuously learning from a single-channel data stream, adapting to new tasks while mitigating catastrophic forgetting. Recently, Mutual Information (MI)-based methods have shown promising performance in OCIL. However, existing MI-based methods treat various knowledge components in isolation, ignoring the knowledge confusion across tasks. This narrow focus on simple MI knowledge alignment may cause old tasks to be easily forgotten as new tasks are introduced, risking the loss of the parts common to past and present knowledge. To address this, we analyze the MI relationships from the perspectives of diversity, representativeness, and separability, and propose an Enhanced Mutual Information (EMI) method based on knowledge decoupling. EMI consists of Diversity Mutual Information (DMI), Representativeness Mutual Information (RMI), and Separability Mutual Information (SMI). DMI diversifies intra-class sample features by considering the similarity relationships among inter-class sample features, enabling the network to learn more general knowledge. RMI summarizes representative features for each category and aligns sample features with these representative features, making the intra-class sample distribution more compact. SMI establishes MI relationships among inter-class representative features, enhancing the stability of representative features while increasing the distinction between them, thus creating clear boundaries between classes. Extensive experimental results on widely used benchmark datasets demonstrate the superior performance of EMI over state-of-the-art baseline methods.
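
The abstract describes RMI and SMI only in words, so the following is a minimal sketch of the general idea rather than the paper's formulation. It assumes that the "representative features" are per-class mean features (prototypes) and swaps in an InfoNCE-style contrastive objective, a common lower bound on mutual information, for the actual RMI and SMI terms. The function names, the `temperature` parameter, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length (zero vectors stay zero)."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def log_softmax(logits):
    """Row-wise, numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def class_prototypes(features, labels, num_classes):
    """Assumed prototype definition: normalized mean feature of each class."""
    feats = l2_normalize(features)
    protos = np.zeros((num_classes, feats.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(axis=0)
    return l2_normalize(protos)

def sample_to_prototype_loss(features, labels, protos, temperature=0.1):
    """InfoNCE-style stand-in for representativeness:
    each sample should be closest to the prototype of its own class."""
    logits = l2_normalize(features) @ protos.T / temperature
    log_probs = log_softmax(logits)
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def prototype_separation_loss(protos, temperature=0.1):
    """InfoNCE-style stand-in for separability:
    each prototype should match itself and no other prototype."""
    logits = protos @ protos.T / temperature
    return float(-np.diag(log_softmax(logits)).mean())

if __name__ == "__main__":
    labels = np.array([0, 0, 1, 1, 2, 2])
    features = np.eye(3)[labels]  # toy features: perfectly clustered
    protos = class_prototypes(features, labels, num_classes=3)
    print("alignment loss:", sample_to_prototype_loss(features, labels, protos))
    print("separation loss:", prototype_separation_loss(protos))
```

In this toy setting both losses are close to zero because the classes are perfectly clustered and the prototypes are orthogonal. Mislabeled samples raise the first loss, and overlapping prototypes raise the second, which mirrors the compactness and boundary effects the abstract attributes to RMI and SMI.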

Paper Structure

This paper contains 21 sections, 28 equations, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: EMI considers the diversity, representativeness, and separability of samples. By enhancing the network's learning ability in OCIL from these three perspectives, EMI achieves better coupling of new and old knowledge.
  • Figure 2: The proposed EMI framework leverages the complementary capabilities of DualNet. Initially, the incoming data stream and replay data are fed into the Slow Net to learn slow features. DMI utilizes these slow features to construct diversity-enhanced MI relationships. Subsequently, the slow features are transformed into fast features by the Fast Net. RMI and SMI leverage these fast features and prototypes to construct representativeness-enhanced and separability-enhanced MI, respectively.
  • Figure 3: Incremental accuracy on tasks observed so far in the test set of CIFAR-10 and CIFAR-100 with different buffer sizes.
  • Figure 4: Feature distributions are plotted with Gaussian kernel density estimation (KDE) in $\mathbb{R}^2$ and visualized on a unit circle. The three rightmost plots show the feature distributions of selected tasks. The representation learned by EMI is evenly distributed, indicating a more uniform feature representation.
  • Figure 5: t-SNE visualizations of features learned on the test set of CIFAR-10. Compared to other methods, the intra-class distribution of EMI is more compact, and the boundaries between classes are clearer.
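
The "evenly distributed" observation in Figure 4 can also be expressed as a number. The sketch below is not taken from the paper: it assumes 2-D features projected onto the unit circle and uses the log of the mean pairwise Gaussian potential, a uniformity measure commonly used in contrastive representation learning, where lower values indicate a more even spread. The function names and the default `t=2.0` are illustrative assumptions, and features are assumed to be nonzero.

```python
import numpy as np

def angles_on_circle(features_2d):
    """Project 2-D features onto the unit circle and return their angles in radians."""
    z = features_2d / np.linalg.norm(features_2d, axis=1, keepdims=True)
    return np.arctan2(z[:, 1], z[:, 0])

def uniformity(features, t=2.0):
    """Log of the mean Gaussian potential exp(-t * ||z_i - z_j||^2)
    over all distinct pairs of unit-normalized features.
    Lower (more negative) values mean a more uniform spread."""
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sq_dists = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    upper = np.triu_indices(len(z), k=1)  # each unordered pair once
    return float(np.log(np.exp(-t * sq_dists[upper]).mean()))

if __name__ == "__main__":
    even_angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
    clustered_angles = np.linspace(0.0, 0.1, 64)
    even = np.stack([np.cos(even_angles), np.sin(even_angles)], axis=1)
    clustered = np.stack([np.cos(clustered_angles), np.sin(clustered_angles)], axis=1)
    print("evenly spread:", uniformity(even))
    print("clustered:", uniformity(clustered))
```

Features spread evenly around the circle score well below features packed into a narrow arc, so the measure tracks the visual impression that a KDE plot on the circle conveys.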