DIGIC: Domain Generalizable Imitation Learning by Causal Discovery
Yang Chen, Yitao Liang, Zhouchen Lin
TL;DR
The paper tackles domain generalization in imitation learning with DIGIC, a two-stage framework: a causal-discovery module first identifies the direct causes of the expert action from the demonstration data distribution, and an imitation policy is then trained on these causal features only. Because the behavioral cloning (BC) policy conditions on the direct causes alone, it generalizes to unseen environments without requiring multi-domain data, and the method can complement cross-domain variation-based approaches under mild non-structural assumptions. The authors identify causal features with a learning-based generalized inverse-covariance approach and validate DIGIC on OpenAI Gym control tasks, where it performs strongly in shifted domains and, when paired with multi-domain methods such as IRM, improves robustness to invariant spurious features. Overall, DIGIC offers a practical and flexible route to robust imitation policies grounded in causal structure derived from demonstrations, reducing reliance on cross-domain data.
Abstract
Causality has been combined with machine learning to produce robust representations for domain generalization. Most existing methods of this type require massive data from multiple domains to identify causal features from cross-domain variations, which can be expensive or even infeasible and may lead to misidentification in some cases. In this work, we take a different approach and leverage the demonstration data distribution to discover the causal features for a domain generalizable policy. We design a novel framework, called DIGIC, that identifies the causal features by finding the direct cause of the expert action from the demonstration data distribution via causal discovery. Our framework can achieve domain generalizable imitation learning with only single-domain data and can complement cross-domain variation-based methods under non-structural assumptions on the underlying causal models. Our empirical study on various control tasks shows that the proposed framework clearly improves domain generalization performance while performing comparably to the expert in the original domain.
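The two-stage idea described above (select the features that directly drive the expert action, then clone the expert on those features only) can be illustrated with a small toy. The sketch below is not the authors' implementation. It swaps in plain partial-correlation screening on the inverse sample covariance for a linear-Gaussian toy, and all names (`make_domain`, `direct_cause_candidates`, `fit_linear_policy`, `act`), the threshold, and the data-generating process are assumptions made for illustration. In this toy no feature is an effect of the action; in general, precision-matrix screening alone returns the action's Markov blanket rather than only its direct causes.

```python
import numpy as np


def make_domain(rng, n, spurious_sign):
    """Hypothetical toy domain: s1 and s2 cause the action; s3 only tracks s1."""
    s1 = rng.normal(size=n)
    s2 = rng.normal(size=n)
    action = 1.5 * s1 - 1.0 * s2 + 0.1 * rng.normal(size=n)
    # The sign of the s1 -> s3 link is what changes between domains.
    s3 = spurious_sign * s1 + 0.3 * rng.normal(size=n)
    return np.column_stack([s1, s2, s3]), action


def direct_cause_candidates(features, actions, threshold=0.1):
    """Stage 1 (simplified): keep features with a non-negligible partial
    correlation with the action, read off the inverse covariance matrix."""
    data = np.column_stack([features, actions])
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    a = data.shape[1] - 1  # index of the action column
    diag = np.diag(precision)
    partial = -precision[:-1, a] / np.sqrt(diag[:-1] * precision[a, a])
    return np.flatnonzero(np.abs(partial) > threshold)


def fit_linear_policy(features, actions, selected):
    """Stage 2: behavioral cloning by least squares on the selected features."""
    design = np.column_stack([features[:, selected], np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, actions, rcond=None)
    return weights


def act(weights, features, selected):
    """Apply the linear policy (last weight is the bias term)."""
    return features[:, selected] @ weights[:-1] + weights[-1]


rng = np.random.default_rng(0)

# Single-domain demonstrations.
train_x, train_a = make_domain(rng, 5000, spurious_sign=1.0)
selected = direct_cause_candidates(train_x, train_a)
weights = fit_linear_policy(train_x, train_a, selected)

# Shifted domain: the spurious relationship is reversed.
test_x, test_a = make_domain(rng, 5000, spurious_sign=-1.0)
shift_mse = float(np.mean((act(weights, test_x, selected) - test_a) ** 2))

print("selected feature indices:", selected)
print("MSE against the expert in the shifted domain:", shift_mse)
```

Here the screening step uses only the single training domain, and the cloned policy never reads the third feature, so reversing that feature's relationship in the shifted domain does not affect its actions.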
