iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning
Tom Fischer, Yaoyao Liu, Artur Jesslen, Noor Ahmed, Prakhar Kaushik, Angtian Wang, Alan Yuille, Adam Kortylewski, Eddy Ilg
TL;DR
Extending continual learning to robust out-of-distribution (OOD) scenarios, this paper introduces iNeMo (Incremental Neural Mesh Models), which grows a library of 3D cuboid meshes over time and retains a copy of the previous backbone together with a replay buffer. The method initializes the latent space via Equiangular Tight Frame partitioning and adds a positional regularization that keeps class features in fixed regions, along with continual training losses and knowledge distillation to prevent forgetting. Empirically, iNeMo outperforms strong 2D baselines for classification by 2–6% in-domain and by 6–50% in the OOD setting on Pascal3D+ and ObjectNet3D, and reports the first incremental pose estimation results. The work demonstrates the practical value of 3D object-centric representations for robust class-incremental learning and paves the way for joint 3D perception under evolving class inventories.
Abstract
In contrast to how humans learn, it is still common practice in vision tasks to train deep learning models only once, on fixed datasets. A variety of approaches have recently addressed handling continual data streams. However, extending these methods to manage out-of-distribution (OOD) scenarios has not been effectively investigated. On the other hand, it has recently been shown that non-continual neural mesh models generalize strongly to such OOD scenarios. To leverage this decisive property in a continual learning setting, we propose incremental neural mesh models that can be extended with new meshes over time. In addition, we present a latent space initialization strategy that enables us to allocate feature space for future unseen classes in advance, and a positional regularization term that forces the features of the different classes to stay consistently in their respective latent space regions. We demonstrate the effectiveness of our method through extensive experiments on the Pascal3D+ and ObjectNet3D datasets and show that our approach outperforms the baselines for classification by $2$-$6\%$ in the in-domain and by $6$-$50\%$ in the OOD setting. Our work also presents the first incremental learning approach for pose estimation. Our code and model can be found at https://github.com/Fischer-Tom/iNeMo.
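
To make the idea of allocating latent space in advance more concrete, the following is a minimal sketch of a generic simplex Equiangular Tight Frame (ETF) construction in NumPy. It is not taken from the iNeMo code base, and the partitioning actually used by the method may differ; the function name `simplex_etf`, the class count of 12, and the feature dimension of 128 are illustrative assumptions. The sketch only shows that one can fix unit-norm anchor directions for all classes, including ones not yet seen, with identical pairwise cosine similarity of $-1/(K-1)$.

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (num_classes, dim) array of unit-norm anchors forming a simplex ETF.

    Hypothetical helper for illustration; requires dim >= num_classes >= 2.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if dim < num_classes:
        raise ValueError("this construction assumes dim >= num_classes")

    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix gives `num_classes` orthonormal
    # columns in R^dim.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))

    k = num_classes
    # Centering matrix I - (1/k) 11^T removes the shared mean direction.
    centering = np.eye(k) - np.ones((k, k)) / k
    # The scale sqrt(k / (k - 1)) restores unit norm after centering.
    etf = np.sqrt(k / (k - 1)) * (q @ centering)
    return etf.T  # one anchor per row


# Reserve anchors for all classes up front (12 is an arbitrary example).
anchors = simplex_etf(num_classes=12, dim=128)
gram = anchors @ anchors.T
off_diagonal = gram[~np.eye(12, dtype=bool)]
print(np.allclose(np.diag(gram), 1.0))       # anchors have unit norm
print(np.allclose(off_diagonal, -1.0 / 11))  # equal pairwise cosine similarity
```

In a class-incremental setup, one plausible use of such anchors is to assign each new class one of the pre-computed rows when it arrives, so that earlier classes do not need to be relocated in feature space; whether and how iNeMo does this is described in the paper itself, not in this sketch.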
