Towards Causal Representation Learning
Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, Yoshua Bengio
TL;DR
This paper argues that combining machine learning with causal inference, through causal representation learning, can address core limitations of current AI, notably poor robustness and weak transfer under distribution shift. It develops a unifying view built on the Independent Causal Mechanisms principle, the levels of causal modeling, and the problem of learning causal variables from high-dimensional data using modular, intervention-friendly representations. It outlines research directions in learning disentangled causal representations, transferable mechanisms, and interventional world models, and discusses implications for semi-supervised learning (SSL), reinforcement learning (RL), and scientific applications. The proposed framework aims to enable more robust generalization, faster transfer, and safer, more interpretable AI by aligning learning with the underlying causal structure of the world.
Abstract
The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. In the present paper, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.
