IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models
Tao Feng, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Gholamreza Haffari
TL;DR
This paper tackles single-source domain generalization for text classification by learning invariant representations from pre-trained transformers. It introduces IMO, a greedy layer-wise approach that learns sparse, domain-invariant feature masks and couples them with token-level attention to focus on predictive tokens. A theoretical analysis links invariant representations to causal features, and experiments show that IMO outperforms strong baselines, including several open LLMs, across sentiment and topic/social-factor tasks while remaining resilient to data scarcity. Ablations and feature analyses provide further insight into the method. The work offers a practical pathway to robust OOD text classification with pre-trained encoders, highlighting the importance of top-down sparse representations and attention in mitigating spurious correlations.
Abstract
Machine learning models have made incredible progress, but they still struggle when applied to examples from unseen domains. This study focuses on a specific problem of domain generalization, where a model is trained on one source domain and tested on multiple target domains that are unseen during training. We propose IMO: Invariant features Masks for Out-of-Distribution text classification, to achieve OOD generalization by learning invariant features. During training, IMO learns sparse mask layers that remove features irrelevant to prediction, so that the remaining features stay invariant. Additionally, IMO has a token-level attention module that focuses on the tokens useful for prediction. Our comprehensive experiments show that IMO substantially outperforms strong baselines across various evaluation metrics and settings.
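The two components named in the abstract, a sparse feature mask and token-level attention, can be illustrated with a minimal sketch. This is not the authors' implementation: the hard 0/1 mask, the single query vector used for attention, the toy values, and the function names `apply_feature_mask` and `attention_pool` are all assumptions made for illustration. In IMO the masks are learned during training, whereas here the mask is fixed by hand.

```python
import math

def apply_feature_mask(tokens, mask):
    """Zero out feature dimensions where mask is 0 (hypothetical hard mask)."""
    return [[x * m for x, m in zip(tok, mask)] for tok in tokens]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(tokens, query):
    """Weight each token by softmax(token . query) and sum into one vector."""
    scores = [sum(x * q for x, q in zip(tok, query)) for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    pooled = [sum(w * tok[d] for w, tok in zip(weights, tokens))
              for d in range(dim)]
    return pooled, weights

# Toy input: 3 tokens with 4 features each (values are made up).
tokens = [[1.0, 2.0, 0.5, -1.0],
          [0.0, 1.0, 3.0, 2.0],
          [2.0, -1.0, 1.0, 0.5]]
mask = [1, 0, 1, 0]            # assumed mask: keep features 0 and 2 only
query = [1.0, 0.0, 0.5, 0.0]   # assumed attention query vector

masked = apply_feature_mask(tokens, mask)
pooled, weights = attention_pool(masked, query)
```

In this sketch the masked-out dimensions of `pooled` are zero, so a classifier placed on top can only use the surviving features, and `weights` shows which tokens dominate the pooled representation.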
