Deep Imitative Models for Flexible Inference, Planning, and Control

Nicholas Rhinehart, Rowan McAllister, Sergey Levine

TL;DR

The paper proposes Deep Imitative Models, which merge imitation learning with goal-directed planning by learning a trajectory density $q(\mathbf{S}_{1:T}|\phi)$ from expert demonstrations and performing posterior inference against a flexible goal likelihood $p(\mathcal{G}|\mathbf{S},\phi)$. Planning is formulated as a MAP problem that maximizes $\log q(\mathbf{S}|\phi) + \log p(\mathcal{G}|\mathbf{S},\phi)$ over trajectories $\mathbf{S}$, yielding multi-step, expert-like trajectories toward novel goals. The approach supports multiple goal likelihoods, including constrained and unconstrained forms, and can incorporate test-time costs. It achieves state-of-the-art performance in CARLA driving while remaining robust to misspecified goals and unseen obstacles. The method is learned offline from demonstrations and offers interpretable planning, broad applicability to autonomous control tasks, and potential safety benefits in real-world deployment.
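
The MAP objective above can be illustrated with a toy sketch. It is not the paper's model: the constant-velocity prior, the fixed noise scale `sigma`, the Gaussian goal likelihood with width `eps`, and the names `rollout`, `objective`, and `plan` are all assumptions made for illustration. A trajectory is generated from latent noise `z`, so $\log q(\mathbf{S}|\phi)$ reduces to a standard-normal log density on `z` (up to a constant), and planning is plain gradient ascent on `z`.

```python
import numpy as np

def rollout(z, s0, v, sigma):
    """Map latent noise z of shape (T, 2) to a trajectory S of shape (T, 2).

    Assumed toy prior: s_t = s_{t-1} + v + sigma * z_t, with z_t ~ N(0, I).
    """
    return s0 + np.cumsum(v + sigma * z, axis=0)

def objective(z, s0, v, sigma, goal, eps):
    """log q(S | phi) + log p(G | S, phi), both up to additive constants."""
    S = rollout(z, s0, v, sigma)
    log_q = -0.5 * np.sum(z ** 2)  # sigma is fixed, so the Jacobian term is constant
    log_goal = -0.5 * np.sum((S[-1] - goal) ** 2) / eps ** 2  # Gaussian goal on the final state
    return log_q + log_goal

def plan(s0, v, sigma, goal, eps, T=10, lr=0.1, steps=500):
    """Gradient ascent on the latent noise z (hypothetical planner)."""
    z = np.zeros((T, 2))
    for _ in range(steps):
        S = rollout(z, s0, v, sigma)
        # Every z_t shifts the final state by sigma, so the goal gradient is shared across t.
        grad = -z - (sigma / eps ** 2) * (S[-1] - goal)
        z = z + lr * grad
    return z, rollout(z, s0, v, sigma)

s0 = np.array([0.0, 0.0])
v = np.array([1.0, 0.0])       # the prior drives straight along x
goal = np.array([8.0, 3.0])    # the goal lies off the straight-line path
z_map, S_map = plan(s0, v, sigma=0.5, goal=goal, eps=1.0)
print(S_map[-1])               # final planned state, pulled toward the goal
```

The plan does not reach the goal exactly. The prior term resists large deviations from expert-like motion, which is the trade-off the MAP objective encodes.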

Abstract

Imitation Learning (IL) is an appealing approach to learn desirable autonomous behavior. However, directing IL to achieve arbitrary goals is difficult. In contrast, planning-based algorithms use dynamics models and reward functions to achieve goals. Yet, reward functions that evoke desirable behavior are often difficult to specify. In this paper, we propose Imitative Models to combine the benefits of IL and goal-directed planning. Imitative Models are probabilistic predictive models of desirable behavior able to plan interpretable expert-like trajectories to achieve specified goals. We derive families of flexible goal objectives, including constrained goal regions, unconstrained goal sets, and energy-based goals. We show that our method can use these objectives to successfully direct behavior. Our method substantially outperforms six IL approaches and a planning-based approach in a dynamic simulated autonomous driving task, and is efficiently learned from expert demonstrations without online data collection. We also show our approach is robust to poorly specified goals, such as goals on the wrong side of the road.

Paper Structure

This paper contains 35 sections, 15 equations, 15 figures, 5 tables, 3 algorithms.

Figures (15)

  • Figure 1: Our method: deep imitative models. Top center: We use demonstrations to learn a probability density function $q$ of future behavior and deploy it to accomplish various tasks. Left: A region in the ground plane is input to a planning procedure that reasons about how the expert would achieve that task. It coarsely specifies a destination, and guides the vehicle to turn left. Right: Goal positions and potholes yield a plan that avoids potholes and achieves one of the goals on the right.
  • Figure 2: Imitative planning with the Gaussian State Sequence enables fine-grained control of the plans.
  • Figure 3: Costs can be assigned to "potholes" only seen at test-time. The planner prefers routes avoiding potholes.
  • Figure 4: Goal regions can be coarsely specified to give directions.
  • Figure 5: Architecture of $m_\theta$ and $\sigma_\theta$, which parameterize $q_\theta(\mathbf{S}|\phi\!=\!\{\chi,\mathbf{s}_{-\tau:0},\boldsymbol{\lambda}\})$. Inputs: LIDAR $\chi$, past states $\mathbf{s}_{-\tau:0}$, light state $\boldsymbol{\lambda}$, and latent noise $\mathbf{Z}_{1:T}$. Output: trajectory $\mathbf{S}_{1:T}$. Details are in the paper's architecture appendix.
  • ...and 10 more figures
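
The goal types named in the abstract and pictured in Figures 2 to 4 (constrained goal regions, unconstrained goal sets, and energy-based test-time costs such as potholes) can each be written as a log-likelihood term over a trajectory. The sketch below uses assumed toy forms, not the paper's definitions: an axis-aligned box for the region, an equal-weight Gaussian mixture for the goal set, and Gaussian bumps for the pothole cost. All function and parameter names are hypothetical.

```python
import numpy as np

def log_goal_region(S, lo, hi):
    """Constrained goal (assumed form): 0 if the final state is inside the box [lo, hi], else -inf."""
    inside = np.all((S[-1] >= lo) & (S[-1] <= hi))
    return 0.0 if inside else -np.inf

def log_goal_set(S, waypoints, eps):
    """Unconstrained goal set (assumed form): log of an equal-weight Gaussian
    mixture centered on the waypoints, up to an additive constant."""
    d2 = np.sum((waypoints - S[-1]) ** 2, axis=1)
    a = -0.5 * d2 / eps ** 2
    return a.max() + np.log(np.mean(np.exp(a - a.max())))  # stable log-mean-exp

def log_pothole_energy(S, potholes, radius, weight):
    """Energy-based term (assumed form): penalize each trajectory point
    by its proximity to each pothole center."""
    d2 = np.sum((S[:, None, :] - potholes[None, :, :]) ** 2, axis=-1)
    return -weight * np.sum(np.exp(-0.5 * d2 / radius ** 2))

# A straight trajectory along the x-axis, and a copy shifted 2 units in y.
S_straight = np.stack([np.arange(1.0, 6.0), np.zeros(5)], axis=1)
S_shifted = S_straight + np.array([0.0, 2.0])
potholes = np.array([[3.0, 0.0]])

print(log_goal_region(S_straight, lo=np.array([4.0, -1.0]), hi=np.array([6.0, 1.0])))
print(log_pothole_energy(S_straight, potholes, radius=0.5, weight=10.0))
print(log_pothole_energy(S_shifted, potholes, radius=0.5, weight=10.0))
```

Any of these terms can stand in for $\log p(\mathcal{G}|\mathbf{S},\phi)$ in the planning objective. The box region is not differentiable, so a gradient-based planner would need a smooth surrogate or a constrained search for that case.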