Not Just Object, But State: Compositional Incremental Learning without Forgetting
Yanyi Zhang, Binglin Qiu, Qi Jia, Yu Liu, Ran He
TL;DR
This work introduces Compositional Incremental Learning (composition-IL), a setting where models continually acquire fine-grained state-object knowledge without forgetting. It proposes CompILer, a rehearsal-free learner that uses multi-pool prompts to model states, objects, and their compositions, augmented by object-injected state prompting and generalized-mean prompt fusion. By reorganizing Clothing16K IVR and UT-Zappos50K into Split-Clothing and Split-UT-Zappos, the authors show that CompILer achieves state-of-the-art Avg Acc and favorable HM while remaining robust to noisy labels through symmetric cross-entropy. The approach targets fine-grained compositional reasoning in incremental learning, with implications for recognizing object attributes as they change over time and across domains.
Abstract
Most incremental learners excessively prioritize coarse object classes while neglecting the various states (e.g., color and material) attached to the objects. As a result, they are limited in their ability to reason about the fine-grained compositionality of state-object pairs. To remedy this limitation, we propose a novel task called Compositional Incremental Learning (composition-IL), which enables a model to recognize state-object compositions as a whole in an incremental learning fashion. Given the lack of suitable benchmarks, we reorganize two existing datasets and tailor them to composition-IL. We then propose a prompt-based Composition Incremental Learner (CompILer) to overcome the ambiguous composition boundary problem, a major challenge in composition-IL. Specifically, we exploit multi-pool prompt learning, regularized by inter-pool prompt discrepancy and intra-pool prompt diversity. In addition, we devise object-injected state prompting, which uses object prompts to guide the selection of state prompts. Furthermore, we fuse the selected prompts with a generalized-mean strategy to eliminate irrelevant information learned in the prompts. Extensive experiments on two datasets show that CompILer achieves state-of-the-art performance.
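To make the generalized-mean fusion mentioned in the abstract more concrete, here is a minimal sketch of the standard power mean applied element-wise across several prompt vectors. It is an illustration under stated assumptions, not the authors' implementation: the names `generalized_mean` and `fuse_prompts`, the exponent `p=3.0`, and the restriction to strictly positive values are all hypothetical choices made for this example, and the abstract does not specify the exact formula used in CompILer.

```python
import math


def generalized_mean(values, p):
    """Power mean of strictly positive numbers: ((1/n) * sum(v**p)) ** (1/p).

    Assumption for this sketch: inputs are strictly positive, so that
    fractional and zero exponents are well defined.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    if p == 0:
        # The limit p -> 0 of the power mean is the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def fuse_prompts(prompts, p=3.0):
    """Fuse equally sized prompt vectors element-wise with the power mean.

    `prompts` is a list of vectors (lists of floats); the default p=3.0
    is an arbitrary illustrative value, not one taken from the paper.
    """
    length = len(prompts[0])
    if any(len(vec) != length for vec in prompts):
        raise ValueError("all prompt vectors must have the same length")
    return [
        generalized_mean([vec[i] for vec in prompts], p)
        for i in range(length)
    ]


# Usage: p = 1 is the arithmetic mean; larger p moves toward the maximum.
selected = [[1.0, 2.0], [3.0, 2.0]]
fused_mean = fuse_prompts(selected, p=1.0)
fused_sharp = fuse_prompts(selected, p=50.0)
```

The exponent controls how strongly large entries dominate: with `p = 1` the fusion is plain averaging, and as `p` grows the result approaches the element-wise maximum. This gives one plausible reading of how such a fusion could suppress weakly activated, less relevant components.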
