DivIL: Unveiling and Addressing Over-Invariance for Out-of-Distribution Generalization
Jiaqi Wang, Yuhang Zhou, Zhixiong Zhang, Qiguang Chen, Yongqiang Chen, James Cheng
TL;DR
The paper identifies a fundamental limitation of invariant learning (IL): over-invariance, where overly strong invariance constraints degrade important details and hurt out-of-distribution (OOD) generalization. To address this, it introduces Diverse Invariant Learning (DivIL), which couples invariant penalties with unsupervised contrastive learning and a novel random masking strategy to diversify learned invariances. The authors formalize over-invariance, provide synthetic and real-data evidence, and validate DivIL across graphs, CMNIST, and natural language inference tasks with multiple backbones and data augmentations. Results show DivIL consistently improves OOD generalization over standard IL baselines, offering a practical approach to robust OOD generalization that spans modalities. The work provides both theoretical insight and a scalable framework for enhancing invariant learning in real-world settings.
Abstract
Out-of-distribution (OOD) generalization is a common problem in which a model is expected to perform well on distributions that differ from, and may be far from, the training data. A popular approach to this problem is invariant learning (IL), in which the model is compelled to focus on invariant features instead of spurious features by adding strong constraints during training. However, strong invariance constraints have potential pitfalls. Because of the limited number of diverse environments and over-regularization in the feature space, they may cause a loss of important details in the invariant features while alleviating spurious correlations. We call this phenomenon over-invariance, and it can also degrade generalization performance. We theoretically define over-invariance and observe that it occurs in various classic IL methods. To alleviate this issue, we propose a simple approach, Diverse Invariant Learning (DivIL), which adds unsupervised contrastive learning and a random masking mechanism to compensate for the invariant constraints and can be applied to various IL methods. Furthermore, we conduct experiments across multiple modalities, covering 12 datasets and 6 classic models, that verify our over-invariance insight and the effectiveness of the DivIL framework. Our code is available at https://github.com/kokolerk/DivIL.
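As a rough illustration of how an invariance penalty might be combined with contrastive learning and random masking, the following is a minimal sketch and not the authors' implementation (see the repository above for that). Everything in it is an assumption made for illustration: the function names, the weights `lam` and `beta`, the use of a risk-variance penalty as a stand-in for a generic IL constraint, the InfoNCE form of the contrastive term, and the choice to build two views by masking the same features twice.

```python
import math
import random


def random_mask(features, mask_ratio, rng):
    """Zero each feature entry independently with probability mask_ratio."""
    return [[0.0 if rng.random() < mask_ratio else x for x in row]
            for row in features]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE loss: row i of view_a is the positive pair of row i of view_b."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(view_a)


def risk_variance_penalty(env_risks):
    """Variance of per-environment risks (hypothetical stand-in IL penalty)."""
    mean = sum(env_risks) / len(env_risks)
    return sum((r - mean) ** 2 for r in env_risks) / len(env_risks)


def combined_objective(env_risks, features, lam=1.0, beta=0.1,
                       mask_ratio=0.3, seed=0):
    """Mean risk + lam * invariance penalty + beta * contrastive term."""
    rng = random.Random(seed)
    # Two independently masked views of the same features (an assumption).
    view_a = random_mask(features, mask_ratio, rng)
    view_b = random_mask(features, mask_ratio, rng)
    mean_risk = sum(env_risks) / len(env_risks)
    return (mean_risk
            + lam * risk_variance_penalty(env_risks)
            + beta * info_nce(view_a, view_b))
```

In this sketch, setting `beta=0` recovers a plain penalized objective, so the contrastive term acts purely as an added, unsupervised signal on the features. In a real training loop the per-environment risks and features would come from a differentiable model, not from fixed lists.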
