Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

Zhen Zhang, Jielei Chu, Tianrui Li

TL;DR

A Probability of Necessity and Sufficiency (PNS)-based regularization method to guide feature expansion in CIL, which quantifies both the causal completeness of intra-task representations and the separability of inter-task representations.

Abstract

Current expansion-based methods for Class Incremental Learning (CIL) effectively mitigate catastrophic forgetting by freezing old features. However, such task-specific features learned from the new task may collide with the old features. From a causal perspective, spurious feature correlations are the main cause of this collision, manifesting in two scopes: (i) guided by empirical risk minimization (ERM), intra-task spurious correlations cause task-specific features to rely on shortcut features. These non-robust features are vulnerable to interference, inevitably drifting into the feature space of other tasks; (ii) inter-task spurious correlations induce semantic confusion between visually similar classes across tasks. To address this, we propose a Probability of Necessity and Sufficiency (PNS)-based regularization method to guide feature expansion in CIL. Specifically, we first extend the definition of PNS to expansion-based CIL, termed CPNS, which quantifies both the causal completeness of intra-task representations and the separability of inter-task representations. We then introduce a dual-scope counterfactual generator based on twin networks to ensure the measurement of CPNS, which simultaneously generates: (i) intra-task counterfactual features to minimize intra-task PNS risk and ensure causal completeness of task-specific features, and (ii) inter-task interfering features to minimize inter-task PNS risk, ensuring the separability of inter-task representations. Theoretical analyses confirm its reliability. The regularization is a plug-and-play method for expansion-based CIL to mitigate feature collision. Extensive experiments demonstrate the effectiveness of the proposed method.

Paper Structure (32 sections, 5 theorems, 42 equations, 7 figures, 7 tables, 1 algorithm)

Key Result

Theorem 3.3

Given the monotonicity assumption, the CPNS is identifiable: it can be expressed as a difference of interventional distributions and is therefore robust to latent confounding. The result covers both scopes, namely the identifiability of the intra-task PNS and the identifiability of the inter-task PNS. Here $do(\cdot)$ denotes the causal intervention operator that defines those interventional distributions.
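The classical PNS result that this theorem builds on can be checked numerically. The sketch below is not the paper's CPNS; it is a toy binary structural causal model with hypothetical mechanisms and coefficients (the function name `simulate_pns` and all constants are assumptions made for illustration). It compares the counterfactual PNS, the difference of interventional probabilities, and the naive observational contrast when a latent variable confounds $X$ and $Y$.

```python
import numpy as np

def simulate_pns(n=200_000, seed=0):
    """Toy check of PNS identifiability under monotonicity (illustrative only)."""
    rng = np.random.default_rng(seed)
    u = rng.random(n)  # latent confounder, uniform on [0, 1)

    # Observational treatment depends on u, so X and Y are confounded.
    x_obs = (rng.random(n) < 0.2 + 0.6 * u).astype(int)

    # Monotone mechanism: switching x from 0 to 1 can only turn Y on.
    def outcome(x):
        return (u + 0.4 * x > 0.7).astype(int)

    y1 = outcome(1)  # potential outcome under do(X=1)
    y0 = outcome(0)  # potential outcome under do(X=0)

    # Counterfactual definition: Y would be 1 under x=1 and 0 under x=0.
    pns_counterfactual = np.mean((y1 == 1) & (y0 == 0))

    # Interventional difference: P(Y=1 | do(X=1)) - P(Y=1 | do(X=0)).
    pns_interventional = y1.mean() - y0.mean()

    # Naive observational contrast, biased by the confounder u.
    y_obs = np.where(x_obs == 1, y1, y0)
    naive = y_obs[x_obs == 1].mean() - y_obs[x_obs == 0].mean()

    return pns_counterfactual, pns_interventional, naive

if __name__ == "__main__":
    cf, iv, naive = simulate_pns()
    print(f"counterfactual PNS: {cf:.3f}")
    print(f"interventional difference: {iv:.3f}")
    print(f"naive observational contrast: {naive:.3f}")
```

In this toy model the first two quantities coincide, while the observational contrast is inflated by the confounder, which is the sense in which an interventional form is robust to latent confounding.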

Figures (7)

  • Figure 1: (a) Illustration of feature suppression and collision. ERM and the diversity strategy drive the model to learn shortcut features (e.g., ears vs. eyes) for semantically similar classes, leading to a fragmented feature space. (b) CKA feature similarity analysis. Our method exhibits high similarity in shallow layers (indicating shared causal semantics) while maintaining discriminability in deep layers.
  • Figure 2: Structural Causal Model (SCM) for expansion-based CIL. Left: the causal generating mechanism. Right: the learning process.
  • Figure 3: Accuracy curves for CPNS on various scenarios and baselines.
  • Figure 4: Hyperparameter sensitivity analysis on CIFAR-100 10-10.
  • Figure 5: Training time and accuracy of different methods in the CIFAR-100 10-10 scenario.
  • ...and 2 more figures
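
The CKA analysis mentioned in Figure 1(b) compares feature matrices extracted from two layers or two models. The source does not say which CKA variant is used, so the sketch below assumes the common linear form; the function name `linear_cka` and the random inputs are illustrative assumptions, not the authors' code.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between feature matrices of shape (n_samples, dim).

    The two matrices must describe the same samples in the same order,
    but their feature dimensions may differ.
    """
    # Center each feature dimension across samples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance (unnormalized).
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2

    # Frobenius norms of the self-covariances (unnormalized).
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")

    return cross / (norm_x * norm_y)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats_a = rng.normal(size=(500, 8))
    feats_b = rng.normal(size=(500, 8))
    print("same features:", linear_cka(feats_a, feats_a))
    print("independent features:", linear_cka(feats_a, feats_b))
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so a value near 1 indicates that two layers encode the same structure up to rotation and scale, while a value near 0 indicates unrelated representations.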

Theorems & Definitions (12)

  • Definition 3.1: Probability of Necessity and Sufficiency (PNS) (Pearl, 2009; Yang et al., 2023)
  • Definition 3.2: Probability of Necessity and Sufficiency in Expansion-based CIL (CPNS)
  • Theorem 3.3: Causal Identifiability of CPNS under Monotonicity
  • Definition 3.4: Double-range counterfactual modeling
  • Theorem 3.5: CPNS risk
  • Theorem 1.1: Causal Identifiability of CPNS under Monotonicity
  • proof
  • Proposition 1.2
  • proof
  • Definition 1.3: Binary Simplification for Inter-task Monotonicity
  • ...and 2 more
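
For orientation, the classical PNS that Definition 3.1 refers to can be written as below, together with the standard argument for why monotonicity reduces it to interventional quantities. This is the textbook form for a binary cause with values $x$ and $x'$; the paper's CPNS extends it to expansion-based CIL, and that extension is not reproduced here. The subscript notation $Y_{x}$ for the potential outcome under $do(X=x)$ is an assumption of this sketch.

```latex
% Classical PNS: y occurs under x and would not occur under x'.
\mathrm{PNS}(x, y) \;=\; P\big(Y_{x} = y,\; Y_{x'} \neq y\big)

% Monotonicity: Y_{x'} = y implies Y_{x} = y, so the event {Y_x = y}
% splits into the disjoint events {Y_{x'} = y} and {Y_x = y, Y_{x'} \neq y}:
P\big(Y = y \mid do(X = x)\big)
  \;=\; P\big(Y = y \mid do(X = x')\big) \;+\; \mathrm{PNS}(x, y)

% Rearranging gives the identifiable form:
\mathrm{PNS}(x, y)
  \;=\; P\big(Y = y \mid do(X = x)\big) \;-\; P\big(Y = y \mid do(X = x')\big)
```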