Robotic Arm Manipulation with Inverse Reinforcement Learning & TD-MPC

Md Shoyib Hassan; Sabir Md Sanaullah

TL;DR

The paper tackles the difficulty of scaling model-based IRL to real robotic manipulation by learning cost functions from visual demonstrations and optimizing with a temporal-difference visual MPC framework that employs a keypoint-based latent state and a pre-trained dynamics model. It introduces a gradient-based IRL mechanism that differentiates through the inner optimization to update cost parameters, and supplements this with an adversarial IRL variant using TD-MPC. Key contributions include a compact, keypoint-based visual representation, a latent dynamics model, and a gradient-based bi-level optimization approach for IRL in vision-based manipulation, demonstrated on a simulated Franka Panda task. The work advances sample efficiency and generalization in visual IRL for robotics and highlights practical avenues for robust visual prediction, viewpoint invariance, and potential natural-language command integration to broaden applicability.

Abstract

Scaling model-based inverse reinforcement learning (IRL) to real robotic manipulation tasks with unpredictable dynamics remains an unresolved problem. The main obstacles are learning from both visual and proprioceptive demonstrations, designing algorithms that scale to high-dimensional state spaces, and learning robust dynamics models. In this work, we present a gradient-based IRL framework that learns cost functions purely from visual human demonstrations. The demonstrated behavior and trajectory are then optimized using temporal-difference (TD) visual model predictive control (MPC) with the learned cost functions. We evaluate our system on basic object manipulation tasks on hardware.

Paper Structure (26 sections, 9 equations, 2 figures, 1 algorithm)

Introduction
Literature Review
Foundational Approaches in IRL
Probabilistic Frameworks
Deep Learning Techniques
Visual Learning
Meta-Learning Algorithms
Visual Model Predictive Control
Inverse Reinforcement Learning
Gradient-Based Visual Model Predictive Control Framework
Applications and Experimental Validation
Temporal-Difference Visual Model Predictive Control Framework
Keypoints as Visual Latent State and Dynamics Model
Temporal-Difference Learning in Model Predictive Control
Gradient-Based IRL from Visual Demonstrations
...and 11 more sections

Figures (2)

Figure 1: A basic overview of our keypoint-based visual model predictive control framework for AIRL. Actions are optimized via Cross Entropy on the cost function.
Figure 2: Panels (a) to (k) show the robot's progress during one test instance, after training with the proposed IRL method. Panel (l) shows how the loss and reward change over the training episodes.
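
The Figure 1 caption mentions optimizing actions "via Cross Entropy on the cost function". Reading that as the cross-entropy method (CEM) is an assumption, as is everything in the sketch below: the 1-D point dynamics, the hand-written quadratic cost, and the names `rollout_cost` and `cem_plan` are hypothetical stand-ins for illustration, not the paper's keypoint-based latent dynamics model or its learned cost function. The sketch only shows the general shape of CEM planning: sample action sequences, score them with the cost, keep the best few, and refit the sampling distribution.

```python
import random
import statistics


def rollout_cost(actions, start, goal):
    """Roll a toy 1-D state forward (s_next = s + a) and sum squared distances to the goal."""
    state, cost = start, 0.0
    for action in actions:
        state += action                 # assumed toy dynamics, not a learned model
        cost += (state - goal) ** 2     # assumed quadratic cost, not a learned cost
    return cost


def cem_plan(start, goal, horizon=5, samples=200, elites=20, iters=15, seed=0):
    """Return the mean action sequence found by the cross-entropy method."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(samples)
        ]
        # Keep the lowest-cost sequences as elites.
        candidates.sort(key=lambda acts: rollout_cost(acts, start, goal))
        best = candidates[:elites]
        # Refit the per-step mean and standard deviation to the elites.
        for t in range(horizon):
            column = [acts[t] for acts in best]
            mean[t] = statistics.fmean(column)
            std[t] = statistics.pstdev(column) + 1e-6  # small floor keeps sampling valid
    return mean


plan = cem_plan(start=0.0, goal=1.0)
final_state = 0.0 + sum(plan)
plan_cost = rollout_cost(plan, 0.0, 1.0)
```

In a receding-horizon MPC loop, only the first action of `plan` would typically be executed before replanning from the next observed state.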
