State-Conditional Adversarial Learning: An Off-Policy Visual Domain Transfer Method for End-to-End Imitation Learning
Yuxiang Liu, Shengfan Cao
TL;DR
This work addresses the challenge of transferring vision-based end-to-end imitation policies across visual domains when target-domain data are off-policy, scarce, and expert-free. It introduces State-Conditional Adversarial Learning (SCAL), an off-policy transfer framework that aligns latent representations conditioned on the system state by estimating a state-conditioned KL divergence via a discriminator. The approach provides a theoretical bound linking target imitation loss to source-domain loss plus latent alignment terms, and demonstrates strong sample efficiency and transfer robustness in BARC-CARLA driving tasks. The results suggest SCAL is a practical, data-efficient method for visual domain transfer in safety-critical settings, with potential for real-world deployment and further theoretical tightening across divergences and domains.
Abstract
We study visual domain transfer for end-to-end imitation learning in a realistic and challenging setting where target-domain data are strictly off-policy, expert-free, and scarce. We first provide a theoretical analysis showing that the target-domain imitation loss can be upper bounded by the source-domain loss plus a state-conditional latent KL divergence between source and target observation models. Guided by this result, we propose State-Conditional Adversarial Learning (SCAL), an off-policy adversarial framework that aligns latent distributions conditioned on system state using a discriminator-based estimator of the conditional KL term. Experiments on visually diverse autonomous driving environments built on the BARC-CARLA simulator demonstrate that SCAL achieves robust transfer and strong sample efficiency.
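To make the idea of a discriminator-based estimator of a state-conditional KL term concrete, the following is a minimal toy sketch and not the authors' implementation. It relies on the standard density-ratio argument: a discriminator that receives both the latent z and the state s, trained on balanced source and target samples sharing the same state distribution, has an optimal logit equal to log p_source(z|s) - log p_target(z|s), so averaging that logit over source samples approximates the state-averaged conditional KL. Everything here is an assumption made for illustration: the one-dimensional Gaussian latents, the linear discriminator, the KL direction (source relative to target), and all function names.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def sample(n, shift, rng):
    """Draw (z, s) pairs with s ~ U(-1, 1) and z | s ~ N(s + shift, 1)."""
    pairs = []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)
        pairs.append((s + shift + rng.gauss(0.0, 1.0), s))
    return pairs


def train_discriminator(source, target, steps=300, lr=1.0):
    """Fit logit(z, s) = w_z*z + w_s*s + b by full-batch gradient descent.

    Source samples are labeled 1 and target samples 0 (hypothetical convention).
    """
    data = [(z, s, 1.0) for z, s in source] + [(z, s, 0.0) for z, s in target]
    n = len(data)
    w_z = w_s = b = 0.0
    for _ in range(steps):
        g_z = g_s = g_b = 0.0
        for z, s, y in data:
            err = sigmoid(w_z * z + w_s * s + b) - y  # gradient of the logistic loss
            g_z += err * z
            g_s += err * s
            g_b += err
        w_z -= lr * g_z / n
        w_s -= lr * g_s / n
        b -= lr * g_b / n
    return w_z, w_s, b


def conditional_kl_estimate(source, weights):
    """Average the discriminator logit over source (z, s) pairs."""
    w_z, w_s, b = weights
    return sum(w_z * z + w_s * s + b for z, s in source) / len(source)


rng = random.Random(0)
source = sample(1000, 0.0, rng)  # toy "source" latents: z | s ~ N(s, 1)
target = sample(1000, 1.0, rng)  # toy "target" latents: z | s ~ N(s + 1, 1)
weights = train_discriminator(source, target)
kl_hat = conditional_kl_estimate(source, weights)
print(f"estimated conditional KL: {kl_hat:.3f} (closed form: 0.5)")
```

In this toy setup the two conditionals are unit-variance Gaussians whose means differ by 1 for every state, so the true conditional KL is 0.5 and the linear discriminator can represent the exact log-ratio. In an adversarial alignment scheme, an encoder would presumably be trained to drive such an estimate toward zero, which is omitted here.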
