Compensating Visual Insufficiency with Stratified Language Guidance for Long-Tail Class Incremental Learning

Xi Wang, Xu Yang, Donghao Sun, Cheng Deng

Abstract

Long-tail class incremental learning (LT-CIL) remains highly challenging because the scarcity of samples in tail classes not only hampers their learning but also exacerbates catastrophic forgetting under continuously evolving and imbalanced data distributions. To tackle these issues, we exploit the informativeness and scalability of language knowledge. Specifically, we analyze the LT-CIL data distribution to guide large language models (LLMs) in generating a stratified language tree that hierarchically organizes semantic information from coarse to fine granularity. Building upon this structure, we introduce stratified adaptive language guidance, which uses learnable weights to merge multi-scale semantic representations, thereby enabling dynamic supervisory adjustment for tail classes and alleviating the impact of data imbalance. Furthermore, we introduce stratified alignment language guidance, which exploits the structural stability of the language tree to constrain optimization and reinforce semantic-visual alignment, thereby alleviating catastrophic forgetting. Extensive experiments on multiple benchmarks demonstrate that our method achieves state-of-the-art performance.
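
The abstract mentions merging multi-scale semantic representations with learnable weights. The following is a minimal sketch of one way such a merge could look, not the authors' implementation: it assumes one text embedding per tree level and a softmax over per-level scalars. The function name `merge_levels`, the softmax parameterization, and the random stand-in embeddings are all hypothetical.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def merge_levels(level_embeddings, logits):
    """Merge per-level text embeddings into one class representation.

    level_embeddings: (L, D) array, one row per tree level (coarse to fine).
    logits: (L,) array of learnable scalars (assumed parameterization).
    Returns the L2-normalized merged embedding and the weights used.
    """
    weights = softmax(logits)
    merged = weights @ level_embeddings
    return merged / np.linalg.norm(merged), weights


# Stand-in embeddings for three levels of a single class (random, illustration only).
rng = np.random.default_rng(0)
levels = rng.normal(size=(3, 8))
levels /= np.linalg.norm(levels, axis=1, keepdims=True)

# Zero logits give uniform weights; training would move them per class.
logits = np.zeros(3)
merged, weights = merge_levels(levels, logits)
```

In a trainable setting the logits would be parameters updated by gradient descent, so a tail class could shift weight toward whichever level of the tree supplies the most useful supervision.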

Paper Structure (32 sections, 21 equations, 12 figures, 8 tables, 1 algorithm)

This paper contains 32 sections, 21 equations, 12 figures, 8 tables, 1 algorithm.

Figures (12)

  • Figure 1: Results on CIFAR100 with imbalance rate $\rho=0.01$ and 10 tasks. Accuracy is reported for head classes, tail classes, and the complete dataset to assess the influence of semantic information on LT-CIL.
  • Figure 2: The overall framework of our method. (A) Stratified language tree generation, (B) stratified adaptive language guidance, and (C) stratified alignment language guidance.
  • Figure 3: Illustration of Prompt Template 2 and SL-Tree.
  • Figure 4: Tail-class accuracy across tasks on ImageNet-R with $\rho=0.01$ after 10 tasks.
  • Figure 5: (a) Robustness to different LLMs; (b) robustness to different prompt templates. All experiments were conducted on ImageNet-R with $\rho=0.01$ after 10 tasks.
  • ...and 7 more figures
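
The captions above refer to an imbalance rate $\rho=0.01$ without defining it here. As a rough illustration only, the sketch below assumes the exponential-decay convention common in long-tail benchmarks, where the rarest class has $\rho$ times as many samples as the most frequent one. The function name `long_tail_counts` and the choice of 500 head-class samples (the per-class training size of CIFAR-100) are assumptions, not details taken from the paper.

```python
def long_tail_counts(n_max, num_classes, rho):
    """Per-class sample counts under an assumed exponential long-tail profile.

    Class i receives n_max * rho ** (i / (num_classes - 1)) samples, so the
    first class keeps n_max samples and the last keeps about n_max * rho.
    """
    return [
        max(1, round(n_max * rho ** (i / (num_classes - 1))))
        for i in range(num_classes)
    ]


# 100 classes, 500 samples for the most frequent class, rho = 0.01.
counts = long_tail_counts(500, 100, 0.01)
head_count, tail_count = counts[0], counts[-1]
```

Under these assumptions the most frequent class keeps 500 samples while the rarest keeps 5, which conveys how little visual data a tail class can contribute.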