Domain Adaptation Through Task Distillation
Brady Zhou, Nimit Kalra, Philipp Krähenbühl
TL;DR
The paper tackles domain shift when transferring learned models from simulation to real-world environments, a central challenge for autonomous systems because labeled real-world data is scarce. It introduces task distillation, a two-stage approach that links the two domains through a proxy recognition task for which labels are readily available. The first stage distills a source model into a proxy model, and the second distills a target model from that proxy so that it operates in the target domain; formally, $f^P := \mathcal{D}_{\mathcal{S}}(f^S)$ and $f^T := \mathcal{D}_{\mathcal{T}}(f^P)$. Because the method uses ground-truth recognition labels rather than end-task labels in the target domain, it reduces the error propagation common in modular pipelines and avoids relying on an imperfect surrogate system at deployment time, with accuracies roughly characterized by $a^T_{distill} = a^P \, G^L \, a^d$. Empirically, task distillation enables cross-simulator policy transfer (ViZDoom, SuperTuxKart to CARLA) and improves semantic segmentation transfer (SYNTHIA-SF to CARLA/Cityscapes), outperforming baselines. Overall, the framework demonstrates that solving recognition across all domains is not strictly necessary: a well-chosen proxy task can yield robust, end-to-end models in the target domain, broadening the practical applicability of simulation-to-reality transfer.
Abstract
Deep networks devour millions of precisely annotated images to build their complex and powerful representations. Unfortunately, tasks like autonomous driving have virtually no real-world training data. Repeatedly crashing a car into a tree is simply too expensive. The commonly prescribed solution is simple: learn a representation in simulation and transfer it to the real world. However, this transfer is challenging since simulated and real-world visual experiences vary dramatically. Our core observation is that for certain tasks, such as image recognition, datasets are plentiful. They exist in any interesting domain, simulated or real, and are easy to label and extend. We use these recognition datasets to link up a source and target domain to transfer models between them in a task distillation framework. Our method can successfully transfer navigation policies between drastically different simulators: ViZDoom, SuperTuxKart, and CARLA. Furthermore, it shows promising results on standard domain adaptation benchmarks.
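The two-stage procedure, $f^P := \mathcal{D}_{\mathcal{S}}(f^S)$ followed by $f^T := \mathcal{D}_{\mathcal{T}}(f^P)$, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes linear models fitted by least squares, a hypothetical latent state that stands in for the ground-truth recognition labels shared by both domains, and random linear "renderers" (`A_src`, `A_tgt`) that stand in for the two simulators. All names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def distill(inputs, teacher_outputs):
    """Fit a linear student that imitates the teacher's outputs (least squares)."""
    weights, *_ = np.linalg.lstsq(inputs, teacher_outputs, rcond=None)
    return weights


# Hypothetical toy setup: a latent scene state plays the role of the proxy labels.
latent_dim, obs_dim, n = 3, 5, 200
w_true = rng.normal(size=(latent_dim, 1))        # ideal policy on the latent state
A_src = rng.normal(size=(latent_dim, obs_dim))   # source-domain "renderer"
A_tgt = rng.normal(size=(latent_dim, obs_dim))   # target-domain "renderer"

# Source model f^S: acts correctly on source observations only.
W_src = np.linalg.pinv(A_src) @ w_true

# Stage 1 (source domain): distill f^S into a proxy model f^P that
# consumes the proxy labels instead of source observations.
labels_src = rng.normal(size=(n, latent_dim))
obs_src = labels_src @ A_src
W_proxy = distill(labels_src, obs_src @ W_src)

# Stage 2 (target domain): distill f^P into a target model f^T that
# consumes raw target observations. No end-task labels are used here,
# only the proxy labels that accompany the target observations.
labels_tgt = rng.normal(size=(n, latent_dim))
obs_tgt = labels_tgt @ A_tgt
W_tgt = distill(obs_tgt, labels_tgt @ W_proxy)

# Evaluate f^T on fresh target observations against the ideal policy.
labels_new = rng.normal(size=(50, latent_dim))
obs_new = labels_new @ A_tgt
target_error = np.max(np.abs(obs_new @ W_tgt - labels_new @ w_true))
print(f"max abs error of f^T on new target observations: {target_error:.2e}")
```

In this idealized setting the proxy labels carry everything the policy needs, so the distilled target model recovers the ideal policy up to numerical precision. In practice both distillation steps are lossy, and at deployment the target model runs on observations alone, without the proxy labels.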
