IDIL: Imitation Learning of Intent-Driven Expert Behavior

Sangwon Seo, Vaibhav Unhelkar

TL;DR

IDIL addresses imitation learning when experts act under time-varying intents. It models expert behavior with an Agent Markov Model (AMM) and learns both an intent-aware policy and the intent dynamics. Its EM-like, non-adversarial algorithm factorizes the occupancy-measure objective into two tractable subproblems, each solved with IQ-Learn, which enables stable learning in high-dimensional or continuous domains. Theoretical results establish convergence under reasonable approximations. Experiments on MultiGoals-$n$ (MG-$n$), OneMover, Movers, and MuJoCo show IDIL outperforming baselines on intent-driven tasks while producing interpretable, diverse behaviors and accurate intent inference. The method is of practical use for human-agent collaboration and for scalable modeling of heterogeneous expert behavior, with extensions to larger intent sets and multi-agent scenarios left as future directions.
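
To make the modeling assumption concrete, the following is a minimal sketch of how a tabular AMM could generate behavior: at each step an intent is drawn from the intent dynamics given the current state and previous intent, and an action is then drawn from the intent-aware policy. This is an illustration only, not the authors' implementation. The names `pi`, `zeta`, `trans`, and `rollout_amm`, the tabular (list-based) representation, the randomly generated distributions, and the fixed initial "previous intent" `x_init` are all assumptions made for this example.

```python
import random

def random_dist(rng, n):
    """Return a random probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def sample(rng, probs):
    """Draw one index according to the probability vector probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def rollout_amm(pi, zeta, trans, s0, x_init, horizon, rng):
    """Sample one trajectory of (state, prev_intent, intent, action) tuples.

    pi[s][x]         : distribution over actions (intent-aware policy)
    zeta[s][x_prev]  : distribution over the next intent (intent dynamics)
    trans[s][a]      : distribution over next states (assumed tabular environment)
    x_init           : assumed placeholder for the intent before the first step
    """
    s, x_prev = s0, x_init
    traj = []
    for _ in range(horizon):
        x = sample(rng, zeta[s][x_prev])   # intent may change over time
        a = sample(rng, pi[s][x])          # action depends on state and intent
        traj.append((s, x_prev, x, a))
        s = sample(rng, trans[s][a])       # environment transition
        x_prev = x
    return traj

# Hypothetical toy problem: 4 states, 2 intents, 3 actions.
rng = random.Random(0)
N_S, N_X, N_A = 4, 2, 3
pi = [[random_dist(rng, N_A) for _ in range(N_X)] for _ in range(N_S)]
zeta = [[random_dist(rng, N_X) for _ in range(N_X)] for _ in range(N_S)]
trans = [[random_dist(rng, N_S) for _ in range(N_A)] for _ in range(N_S)]
traj = rollout_amm(pi, zeta, trans, s0=0, x_init=0, horizon=10, rng=rng)
```

In demonstrations, only the states and actions of such a trajectory would be observed; the intents are latent, which is why an imitation learner in this setting has to infer them.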

Abstract

When faced with accomplishing a task, human experts exhibit intentional behavior. Their unique intents shape their plans and decisions, resulting in experts demonstrating diverse behaviors to accomplish the same task. Due to the uncertainties encountered in the real world and their bounded rationality, experts sometimes adjust their intents, which in turn influences their behaviors during task execution. This paper introduces IDIL, a novel imitation learning algorithm to mimic these diverse intent-driven behaviors of experts. Iteratively, our approach estimates expert intent from heterogeneous demonstrations and then uses it to learn an intent-aware model of their behavior. Unlike contemporary approaches, IDIL is capable of addressing sequential tasks with high-dimensional state representations, while sidestepping the complexities and drawbacks associated with adversarial training (a mainstay of related techniques). Our empirical results suggest that the models generated by IDIL either match or surpass those produced by recent imitation learning benchmarks in metrics of task performance. Moreover, as it creates a generative model, IDIL demonstrates superior performance in intent inference metrics, crucial for human-agent interactions, and aptly captures a broad spectrum of expert behaviors.

Paper Structure

This paper contains 51 sections, 4 theorems, 42 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Proposition 2.1

Consider two arbitrary AMM models: $\mathcal{N}$ and $\mathcal{N}'$. Then, $\rho_{\mathcal{N}}(s, a, x) = \rho_{\mathcal{N}'}(s, a, x)$ and $\rho_{\mathcal{N}}(s, x, x^-) = \rho_{\mathcal{N}'}(s, x, x^-)$ if and only if $\rho_{\mathcal{N}}(s, a, x, x^-)=\rho_{\mathcal{N}'}(s, a, x, x^-)$.
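
The following is a brief sketch of why a statement of this form is plausible; it is not the paper's proof. It assumes, consistent with the description of intent-driven behavior here, that the action depends on the previous intent $x^-$ only through the current state and intent $(s, x)$, and it introduces the shorthand $\rho_{\mathcal{N}}(s, x) = \sum_{a} \rho_{\mathcal{N}}(s, a, x)$, which does not appear in the statement above.

One direction is marginalization: the two smaller occupancy measures are sums of the joint one,

$$
\rho_{\mathcal{N}}(s, a, x) = \sum_{x^-} \rho_{\mathcal{N}}(s, a, x, x^-), \qquad
\rho_{\mathcal{N}}(s, x, x^-) = \sum_{a} \rho_{\mathcal{N}}(s, a, x, x^-),
$$

so equal joint measures give equal marginals. For the other direction, under the assumed conditional independence the joint measure can be rebuilt from the two marginals wherever $\rho_{\mathcal{N}}(s, x) > 0$:

$$
\rho_{\mathcal{N}}(s, a, x, x^-) = \frac{\rho_{\mathcal{N}}(s, a, x)}{\rho_{\mathcal{N}}(s, x)} \, \rho_{\mathcal{N}}(s, x, x^-),
$$

so two models that agree on both marginals also agree on the joint. This is what makes it reasonable to match the two smaller occupancy measures separately rather than the full joint one.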

Figures (6)

  • Figure 1: Consider the task of emptying a table. Different individuals may accomplish this task differently; some starting with the red block, while others with the green or blue. Intent-driven imitation learning aims to model this diversity in behaviors (arising from differences in experts' intents) from heterogeneous demonstrations.
  • Figure 2: Dynamic Bayesian network representing Intent-Driven Behavior. Shaded nodes denote known or observable variables; other variables are latent.
  • Figure 3: Visual illustrations of Experimental Domains.
  • Figure 4: MultiGoals-$3$ trajectories generated by the expert and learnt models according to different intents.
  • Figure 5: MultiGoals domains that include different numbers of intents.
  • ...and 1 more figure

Theorems & Definitions (4)

  • Proposition 2.1
  • Lemma 2.2
  • Lemma 2.3
  • Theorem 2.4