Mitigating Spurious Correlations with Causal Logit Perturbation
Xiaoling Zhou, Wei Ye, Rui Xie, Shikun Zhang
TL;DR
The paper tackles spurious correlations in deep learning by introducing Causal Logit Perturbation (CLP), a framework that learns sample-specific logit perturbations to suppress non-causal cues such as image backgrounds. CLP uses a perturbation network (a two-layer MLP) that takes ten training-dynamics features per sample as input and is trained in an online meta-learning loop, with metadata augmented in both counterfactual and factual ways guided by human causal knowledge. The approach is reported to achieve state-of-the-art results across four biased-learning scenarios (long-tail learning, noisy label learning, subpopulation shift learning, and generalized long-tail learning), and visualization and ablation studies indicate that the perturbations redirect attention to causal features and disrupt spurious associations. Overall, CLP offers a practical, generalizable mechanism for improving robustness and fairness in vision models by modeling causal interventions at the logit level.
Abstract
Deep learning has seen widespread success in domains spanning science, industry, and society. However, many deep learning approaches are not robust, because they rely on spurious correlations to make predictions. Addressing this limitation requires methods that can break such spurious correlations. This study implements causal models via logit perturbations and introduces a novel Causal Logit Perturbation (CLP) framework, which trains classifiers with causal logit perturbations generated for individual samples, thereby mitigating the spurious associations between non-causal attributes (e.g., image backgrounds) and classes. The framework employs a perturbation network that generates sample-wise logit perturbations from a set of training characteristics of each sample. The whole framework is optimized by an online meta-learning algorithm and leverages human causal knowledge by augmenting metadata in both counterfactual and factual manners. Empirical evaluations on four typical biased learning scenarios, namely long-tail learning, noisy label learning, generalized long-tail learning, and subpopulation shift learning, demonstrate that CLP consistently achieves state-of-the-art performance. Moreover, visualization results support the effectiveness of the generated causal perturbations in redirecting model attention towards causal image attributes and dismantling spurious associations.
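To make the idea of a sample-wise logit perturbation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-layer MLP with a ReLU hidden layer that maps ten per-sample training features to one perturbation per class, which is then added to the classifier logits before a softmax cross-entropy loss. The hidden width, the activation, the initialization, and all function names are hypothetical, and the online meta-learning loop that would train the network is omitted.

```python
import math
import random

NUM_FEATURES = 10  # ten training-dynamics features per sample (from the summary)
HIDDEN = 16        # hypothetical hidden width; not stated in the source


def init_mlp(num_classes, seed=0):
    """Randomly initialise a two-layer MLP: features -> hidden -> per-class perturbation."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(NUM_FEATURES)] for _ in range(HIDDEN)]
    b1 = [0.0] * HIDDEN
    w2 = [[rng.gauss(0.0, 0.1) for _ in range(HIDDEN)] for _ in range(num_classes)]
    b2 = [0.0] * num_classes
    return w1, b1, w2, b2


def perturbation(params, features):
    """Forward pass: one perturbation value per class for a single sample."""
    w1, b1, w2, b2 = params
    # ReLU hidden layer (the activation is an assumption of this sketch).
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def perturbed_cross_entropy(logits, delta, label):
    """Softmax cross-entropy computed on the perturbed logits (logits + delta)."""
    z = [u + d for u, d in zip(logits, delta)]
    m = max(z)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(v - m) for v in z))
    return log_norm - z[label]


# Usage with made-up inputs for one sample and three classes.
params = init_mlp(num_classes=3)
features = [0.5] * NUM_FEATURES   # placeholder training-dynamics features
logits = [2.0, 0.5, -1.0]         # placeholder classifier outputs
delta = perturbation(params, features)
loss = perturbed_cross_entropy(logits, delta, label=0)
```

With a zero perturbation the loss reduces to ordinary cross-entropy, so in this sketch the perturbation network only changes training where its output is non-zero for a given sample.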
