A Survey of Imitation Learning: Algorithms, Recent Developments, and Challenges

Maryam Zare; Parham M. Kebria; Abbas Khosravi; Saeid Nahavandi

TL;DR

This survey introduces imitation learning (IL), in which desired behavior is learned by imitating an expert's demonstrations, and gives an overview of its underlying assumptions and approaches.

Abstract

In recent years, the development of robotics and artificial intelligence (AI) systems has been remarkable. As these systems continue to evolve, they are being deployed in increasingly complex and unstructured settings, such as autonomous driving, aerial robotics, and natural language processing. As a consequence, programming their behaviors manually or defining their behavior through reward functions (as done in reinforcement learning (RL)) has become exceedingly difficult. Such settings demand a high degree of flexibility and adaptability, which makes it challenging to specify a set of rules or reward signals that accounts for all possible situations. In these cases, learning from an expert's behavior through imitation is often more appealing. This is where imitation learning (IL) comes into play: a process in which desired behavior is learned by imitating an expert's behavior, which is provided through demonstrations. This paper provides an introduction to IL and an overview of its underlying assumptions and approaches. It also offers a detailed description of recent advances and emerging areas of research in the field. Additionally, the paper discusses how researchers have addressed common challenges associated with IL and outlines potential directions for future research. Overall, the goal of the paper is to provide a comprehensive guide to the growing field of IL in robotics and AI.
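To make the idea of learning from demonstrations concrete, the following is a minimal sketch of behavioral cloning, the simplest IL approach covered by the survey, written as plain supervised regression from states to expert actions. The linear expert, the gain matrix `K_TRUE`, and the helper names are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np

# Hypothetical expert: a fixed linear controller (an assumption for this sketch).
K_TRUE = np.array([[1.0, -0.5],
                   [0.3, 2.0]])


def expert_policy(state):
    """Return the expert's action for a single state."""
    return state @ K_TRUE


def collect_demonstrations(policy, n_samples, state_dim, rng):
    """Sample states and record the expert's action in each one."""
    states = rng.normal(size=(n_samples, state_dim))
    actions = np.array([policy(s) for s in states])
    return states, actions


def fit_linear_policy(states, actions):
    """Behavioral cloning as least squares: find W with states @ W ~ actions."""
    W, *_ = np.linalg.lstsq(states, actions, rcond=None)
    return W


rng = np.random.default_rng(0)
states, actions = collect_demonstrations(expert_policy, 200, 2, rng)
W = fit_linear_policy(states, actions)

# The cloned policy acts by applying the fitted weights to a new state.
new_state = np.array([0.5, -1.0])
cloned_action = new_state @ W
```

Because the demonstrations here are noise-free and the policy class contains the expert, the fitted weights recover the expert's controller. In realistic settings the learner drifts into states the demonstrations never covered, which is the covariate shift problem the survey discusses.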

Paper Structure (9 sections, 9 figures, 2 tables)

This paper contains 9 sections, 9 figures, 2 tables.

Introduction
Behavioral Cloning
Inverse Reinforcement Learning
Adversarial Imitation Learning
Imitation from Observation
Challenges and Limitations
Imperfect Demonstrations
Domain Discrepancies
Opportunities and Future Work

Figures (9)

Figure 1: A historical timeline of IL research illustrating key achievements in the field.
Figure 2: A categorization of methods addressing the covariate shift problem. Interactive IL assumes access to an online expert. DAgger-like algorithms require the expert to provide corrective labels for each action taken by the agent. Human-gated and robot-gated methods, by contrast, provide corrective labels only when they are requested by the expert or the agent, respectively. Unlike interactive IL, IRL methods do not require access to an online expert. These methods require an underlying RL algorithm to optimize a reward function (either learned from demonstrations or fixed). Lastly, by constraining the agent to known regions of the space covered by demonstrations, constrained IL attempts to address covariate shift problems that cannot be expressed or solved using the other two categories.
Figure 3: Left: BC might learn a shortcut from prior observations that outputs the previous action as the current action. Right: a copycat-free memory extraction module, with which the shortcut can no longer be learned from historical information [chuang2022resolving].
Figure 4: A low-dimensional state feature embedding is pre-trained using ranked demonstrations [brown2020safe]. The reward function is derived as a linear combination of the learned features. Each MCMC proposal is evaluated with a pairwise ranking likelihood, which gives the likelihood of the preferences over demonstrations given a proposal (w). Using the pre-computed embeddings of the ranked demonstrations makes MCMC sampling highly efficient; there is no need for data collection or an MDP solver during inference.
Figure 5: A context translation model is trained on several videos of expert demonstrations [liu2018imitation]. During learning, the robot observes the context of the task it must perform, and the model then determines what an expert would do in the robot's context.
...and 4 more figures
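
The reward inference described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Bradley-Terry form for the pairwise ranking likelihood, a uniform prior over unit-norm weight vectors, and a random-walk Metropolis-Hastings sampler. The feature matrix, the preference pairs, and all function names are hypothetical.

```python
import numpy as np


def log_ranking_likelihood(w, features, preferences, beta=1.0):
    """Log-likelihood of pairwise preferences under a linear reward.

    features:    (n_demos, d) array, one pre-computed feature sum per demo.
    preferences: list of (i, j) pairs meaning demo j is preferred over demo i.
    """
    returns = beta * (features @ w)
    total = 0.0
    for i, j in preferences:
        # Log of the softmax probability assigned to the preferred demo.
        total += returns[j] - np.logaddexp(returns[i], returns[j])
    return total


def mcmc_reward_samples(features, preferences, n_steps=2000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings over unit-norm reward weights."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    log_p = log_ranking_likelihood(w, features, preferences)
    samples = []
    for _ in range(n_steps):
        # Perturb the current weights and project back onto the unit sphere.
        proposal = w + step * rng.normal(size=d)
        proposal /= np.linalg.norm(proposal)
        log_p_new = log_ranking_likelihood(proposal, features, preferences)
        # Accept or reject; only cached features are used, no rollouts.
        if np.log(rng.uniform()) < log_p_new - log_p:
            w, log_p = proposal, log_p_new
        samples.append(w.copy())
    return np.array(samples)


# Toy data (hypothetical): four demos ranked by their first feature.
features = np.array([[0.0, 1.0],
                     [1.0, 0.5],
                     [2.0, -1.0],
                     [3.0, 0.0]])
preferences = [(i, j) for i in range(4) for j in range(i + 1, 4)]

samples = mcmc_reward_samples(features, preferences)
mean_w = samples.mean(axis=0)
```

Each proposal is scored using only the cached feature sums of the demonstrations, which is why, as the caption notes, sampling needs neither new data collection nor an MDP solver.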
