Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL

Ruitao Wu; Yifan Zhao; Guangyao Chen; Jia Li

TL;DR

This work tackles Few-Shot Class-Incremental Learning (FSCIL) by coupling diffusion-based data generation with an FSCIL classifier through a reward-guided mutual boosting loop. Rewards are split into two levels. Feature-level rewards enforce semantic coherence and diversity via prototype-anchored maximum mean discrepancy (PAMMD) and dimension-wise variance matching (VM). Logits-level rewards make generation classifier-aware via confidence recalibration (RC) and cross-session confusion-aware (CSCA) mechanisms. With these signals, the framework iteratively improves both the diffusion outputs and the classifier's discrimination across sessions. The approach achieves state-of-the-art results on standard FSCIL benchmarks, retaining knowledge of old classes while incorporating new ones from limited data. This co-evolutionary strategy shows that feeding classifier feedback into diffusion guidance can substantially improve data-efficient continual learning, with practical implications for real-world, dynamic environments.

Abstract

Few-Shot Class-Incremental Learning (FSCIL) challenges models to sequentially learn new classes from minimal examples without forgetting prior knowledge, a task complicated by the stability-plasticity dilemma and data scarcity. Current FSCIL methods often struggle with generalization due to their reliance on limited datasets. While diffusion models offer a path for data augmentation, their direct application can lead to semantic misalignment or ineffective guidance. This paper introduces Diffusion-Classifier Synergy (DCS), a novel framework that establishes a mutual boosting loop between a diffusion model and an FSCIL classifier. DCS uses a reward-aligned learning strategy in which a dynamic, multi-faceted reward function derived from the classifier's state directs the diffusion model. This reward system operates at two levels: the feature level ensures semantic coherence and diversity using prototype-anchored maximum mean discrepancy and dimension-wise variance matching, while the logits level promotes exploratory image generation and enhances inter-class discriminability through confidence recalibration and cross-session confusion-aware mechanisms. In this co-evolutionary process, generated images refine the classifier and an improved classifier state yields better reward signals. DCS achieves state-of-the-art performance on FSCIL benchmarks, significantly enhancing both knowledge retention and new class learning.
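
To make the feature-level reward more concrete, the following is a minimal sketch of how a prototype-anchored MMD term and a dimension-wise variance-matching term might be combined into a single score. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the RBF kernel, the biased MMD estimator, the way the prototype is appended to the reference set as an anchor, the weight `lam`, and the function names `mmd2` and `feature_level_reward` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d2 = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def mmd2(x, y, gamma=0.1):
    # Biased (V-statistic) estimate of the squared maximum mean discrepancy.
    kxx = rbf_kernel(x, x, gamma).mean()
    kyy = rbf_kernel(y, y, gamma).mean()
    kxy = rbf_kernel(x, y, gamma).mean()
    return kxx + kyy - 2.0 * kxy

def feature_level_reward(gen_feats, real_feats, prototype, gamma=0.1, lam=1.0):
    # Assumption: "prototype-anchored" is modelled by adding the class
    # prototype to the reference set that generated features are compared to.
    reference = np.vstack([real_feats, prototype[None, :]])
    mmd_term = mmd2(gen_feats, reference, gamma)
    # Assumption: variance matching penalises the squared gap between the
    # per-dimension variances of generated and real features.
    var_gap = gen_feats.var(axis=0) - real_feats.var(axis=0)
    var_term = np.mean(var_gap ** 2)
    # Higher reward means generated features are closer to the class.
    return -(mmd_term + lam * var_term)
```

Under these assumptions, generated features that sit near the real class features and its prototype, with a similar per-dimension spread, receive a higher reward than features that drift away or collapse to a single point. In the framework described above, such a score would be one ingredient of the signal that guides the diffusion model.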
