A Second-Order Perspective on Model Compositionality and Incremental Learning
Angelo Porrello, Lorenzo Bonicelli, Pietro Buzzega, Monica Millunzi, Simone Calderara, Rita Cucchiara
TL;DR
This work addresses how to achieve reliable compositionality among independently fine-tuned modules in non-linear deep networks. It introduces a second-order Taylor analysis of the loss around the pre-training weights $\bm{\theta}_0$ and develops two incremental training strategies, Incremental Task Arithmetic (ITA) and Incremental Ensemble Learning (IEL), to realize modular composition. The authors derive a Jensen-type bound linking the composed model's risk to the risks of the individual modules, and propose a diagonal-Fisher-based regularizer and a Fisher-based ensemble term that constrain training and preserve pre-training knowledge. Empirically, ITA and IEL achieve state-of-the-art or competitive final accuracy across diverse class-incremental benchmarks while enabling specialization and unlearning with efficient inference, which points to a practical pathway toward composable, lifelong vision models.
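The Jensen-type bound mentioned above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes the loss is exactly its second-order Taylor expansion around $\bm{\theta}_0$ with a positive semi-definite Hessian (the "pre-training basin" assumption), and that modules are composed with convex weights. All names (`quadratic_loss`, `compose`, the random gradient and Hessian) are hypothetical and chosen for illustration only.

```python
import numpy as np


def quadratic_loss(theta, theta0, loss0, grad, hessian):
    """Second-order Taylor approximation of a loss around theta0."""
    delta = theta - theta0
    return loss0 + grad @ delta + 0.5 * delta @ hessian @ delta


def compose(theta0, task_vectors, weights):
    """Add a weighted sum of task vectors (theta_t - theta0) to theta0."""
    return theta0 + sum(w * tau for w, tau in zip(weights, task_vectors))


rng = np.random.default_rng(0)
dim, num_tasks = 5, 3

theta0 = rng.normal(size=dim)      # stand-in for pre-training weights
loss0 = 1.0                        # assumed loss value at theta0
grad = rng.normal(size=dim)        # assumed gradient at theta0
A = rng.normal(size=(dim, dim))
hessian = A @ A.T                  # PSD by construction (assumption)

# Each task vector plays the role of one independently fine-tuned module.
task_vectors = [rng.normal(size=dim) for _ in range(num_tasks)]
weights = np.full(num_tasks, 1.0 / num_tasks)  # convex weights, sum to 1

composed_loss = quadratic_loss(
    compose(theta0, task_vectors, weights), theta0, loss0, grad, hessian
)
individual_losses = [
    quadratic_loss(theta0 + tau, theta0, loss0, grad, hessian)
    for tau in task_vectors
]
jensen_bound = float(np.dot(weights, individual_losses))

print(f"composed loss: {composed_loss:.4f}")
print(f"Jensen bound:  {jensen_bound:.4f}")
```

Under these assumptions the approximated loss is convex, so the composed model's loss cannot exceed the weighted average of the individual modules' losses. With a Hessian that is not positive semi-definite, the inequality can fail, which is one way to read the emphasis on staying within the pre-training basin.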
Abstract
The fine-tuning of deep pre-trained models has revealed compositional properties, with multiple specialized modules that can be arbitrarily composed into a single, multi-task model. However, identifying the conditions that promote compositionality remains an open issue, with recent efforts concentrating mainly on linearized networks. We conduct a theoretical study that attempts to demystify compositionality in standard non-linear networks through the second-order Taylor approximation of the loss function. The proposed formulation highlights the importance of staying within the pre-training basin to achieve composable modules. Moreover, it provides the basis for two dual incremental training algorithms: one takes the perspective of multiple models trained individually, while the other optimizes the composed model as a whole. We probe their application in incremental classification tasks and highlight several valuable capabilities: the pool of incrementally learned modules not only supports the creation of an effective multi-task model but also enables unlearning and specialization in certain tasks. Code is available at https://github.com/aimagelab/mammoth.
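The way a pool of modules can support multi-task composition, specialization, and unlearning can be sketched with plain weight arithmetic. This is a hedged illustration only: it assumes each module is stored as a task vector (fine-tuned weights minus pre-training weights) and that composition is a uniform average over the selected modules, which may differ from the procedure used in the paper. The function `compose_from_pool` and the task names are hypothetical.

```python
import numpy as np


def compose_from_pool(theta0, pool, selected):
    """Return theta0 plus the uniform average of the selected task vectors.

    An empty selection falls back to a copy of the pre-training weights.
    """
    if not selected:
        return theta0.copy()
    mean_tau = np.mean([pool[name] for name in selected], axis=0)
    return theta0 + mean_tau


theta0 = np.zeros(4)  # stand-in for pre-training weights

# Hypothetical pool of incrementally learned modules, one per task.
pool = {
    "task_a": np.array([1.0, 0.0, 0.0, 0.0]),
    "task_b": np.array([0.0, 2.0, 0.0, 0.0]),
    "task_c": np.array([0.0, 0.0, 3.0, 0.0]),
}

# Multi-task model: compose every module in the pool.
multi_task = compose_from_pool(theta0, pool, list(pool))

# Specialization: keep only the module(s) for the task of interest.
specialized = compose_from_pool(theta0, pool, ["task_b"])

# Unlearning: recompose without the module of the task to forget.
unlearned = compose_from_pool(theta0, pool, [t for t in pool if t != "task_c"])

print(multi_task, specialized, unlearned)
```

Because the modules are kept separately, forgetting a task in this sketch amounts to recomposing without its module rather than retraining the model.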
