CGLearn: Consistent Gradient-Based Learning for Out-of-Distribution Generalization

Jawad Chowdhury; Gabriel Terejanu

TL;DR

This work introduces CGLearn, a simple yet powerful approach that relies on the agreement of gradients across environments. Agreement serves as a strong indication of reliable features, while disagreement suggests lower reliability due to potential differences in underlying causal mechanisms.

Abstract

Improving generalization and building highly predictive, robust machine learning models requires learning the underlying causal structure of the variables of interest. A prominent and effective method for this is learning invariant predictors across multiple environments. In this work, we introduce CGLearn, a simple yet powerful approach that relies on the agreement of gradients across environments. This agreement serves as a strong indication of reliable features, while disagreement suggests lower reliability due to potential differences in underlying causal mechanisms. Our proposed method outperforms state-of-the-art methods in both linear and nonlinear settings across various regression and classification tasks. CGLearn remains applicable even in the absence of separate environments, by exploiting invariance across different subsamples of observational data. Comprehensive experiments on both synthetic and real-world datasets highlight its effectiveness in diverse scenarios. Our findings underscore the importance of leveraging gradient agreement for learning causal invariance, providing a significant step forward in the field of robust machine learning. The source code for the linear and nonlinear implementations of CGLearn is open-source and available at: https://github.com/hasanjawad001/CGLearn.
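To make the idea of gradient agreement concrete, the following is a minimal sketch for a linear model, not the authors' implementation. It assumes that agreement is scored per feature as the ratio of the absolute mean gradient to the standard deviation of gradients across environments, and that only features scoring above a threshold are updated. The function names, the score, and the `threshold` parameter are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def env_gradients(w, envs):
    """Gradient of the mean squared error of y ~ X @ w, one row per environment."""
    grads = []
    for X, y in envs:
        residual = X @ w - y
        grads.append(2.0 * X.T @ residual / len(y))
    return np.stack(grads)  # shape: (n_envs, n_features)

def consistent_gradient_step(w, envs, lr=0.05, threshold=1.0, eps=1e-12):
    """One gradient step that updates only features whose gradients agree.

    Assumption: agreement is measured as |mean| / std of the per-environment
    gradients; this score and the threshold are hypothetical choices.
    """
    grads = env_gradients(w, envs)
    mean = grads.mean(axis=0)
    std = grads.std(axis=0)
    ratio = np.abs(mean) / (std + eps)  # large when environments agree
    mask = ratio >= threshold           # True for features treated as reliable
    return w - lr * mean * mask, mask

# Two toy environments: feature 0 relates to y identically in both,
# feature 1 flips sign between them.
envs = [
    (np.array([[1.0, 1.0]]), np.array([1.0])),
    (np.array([[1.0, -1.0]]), np.array([1.0])),
]
w_new, mask = consistent_gradient_step(np.zeros(2), envs)
```

In this toy case the gradients for feature 0 coincide across the two environments, so its weight is updated, while the gradients for feature 1 cancel and its weight is left unchanged. When no separate environments are available, the same step could in principle be applied to random subsamples of a single dataset passed in as `envs`.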
