Data efficient Robotic Object Throwing with Model-Based Reinforcement Learning

Niccolò Turcato, Giulio Giacomuzzo, Matteo Terreran, Davide Allegro, Ruggero Carli, Alberto Dalla Libera

TL;DR

This work presents MC-PILOT, a Model-Based Reinforcement Learning framework that enables data-efficient robotic object throwing by learning a probabilistic dynamics model via Gaussian Process Regression and optimizing a release-velocity policy under release-delay uncertainties. It extends MC-PILCO by accommodating target-domain variation, gripper delays, and drag, using an augmented state and Monte Carlo policy optimization with the reparameterization trick. The method demonstrates rapid generalization to unseen targets and objects, outperforming analytical and Model-Free baselines in simulation and on a Franka Panda, and it can adapt to new task requirements with minimal additional data. By explicitly modeling delays and environmental uncertainties, MC-PILOT offers a practical route to expanding robot workspace through Pick-and-Throw with high data efficiency and robustness.

Abstract

Pick-and-place (PnP) operations, featuring object grasping and trajectory planning, are fundamental in industrial robotics applications. Despite many advancements in the field, PnP is limited by workspace constraints, reducing flexibility. Pick-and-throw (PnT) is a promising alternative where the robot throws objects to target locations, leveraging extrinsic resources like gravity to improve efficiency and expand the workspace. However, PnT execution is complex, requiring precise coordination of high-speed movements and object dynamics. Solutions to the PnT problem are categorized into analytical and learning-based approaches. Analytical methods focus on system modeling and trajectory generation but are time-consuming and offer limited generalization. Learning-based solutions, in particular Model-Free Reinforcement Learning (MFRL), offer automation and adaptability but require extensive interaction time. This paper introduces a Model-Based Reinforcement Learning (MBRL) framework, MC-PILOT, which combines data-driven modeling with policy optimization for efficient and accurate PnT tasks. MC-PILOT accounts for model uncertainties and release errors, demonstrating superior performance in simulations and real-world tests with a Franka Emika Panda manipulator. The proposed approach generalizes rapidly to new targets, offering advantages over analytical and Model-Free methods.
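The idea of optimizing a release velocity under release-delay uncertainty can be illustrated with a toy Monte Carlo sketch. This is not the authors' implementation: it swaps the learned Gaussian Process model for a drag-free ballistic model, assumes the gripper keeps moving in a straight line during the delay, and replaces gradient-based policy optimization with a simple search over candidate speeds. All names and numeric values (`landing_x`, `expected_miss`, `best_speed`, the 35° release angle, the delay statistics) are hypothetical. The delay is sampled as `mean + std * eps` with `eps ~ N(0, 1)`, which is the form used by the reparameterization trick.

```python
import math
import random

G = 9.81  # gravitational acceleration [m/s^2]


def landing_x(speed, angle, height, delay):
    """Horizontal landing position of a drag-free throw.

    Toy assumption: during the release delay the gripper keeps moving in a
    straight line at the release velocity, so the release point shifts.
    """
    vx = speed * math.cos(angle)
    vz = speed * math.sin(angle)
    x0 = vx * delay
    z0 = height + vz * delay
    # Time of flight: positive root of z0 + vz*t - 0.5*G*t^2 = 0.
    t_flight = (vz + math.sqrt(vz * vz + 2.0 * G * z0)) / G
    return x0 + vx * t_flight


def expected_miss(speed, target_x, n_samples=2000,
                  delay_mean=0.02, delay_std=0.005, seed=0):
    """Monte Carlo estimate of the mean absolute landing error [m]."""
    rng = random.Random(seed)  # fixed seed: same noise for every candidate
    angle = math.radians(35.0)  # hypothetical fixed release angle
    height = 0.5                # hypothetical release height [m]
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        delay = delay_mean + delay_std * eps  # reparameterized delay sample
        total += abs(landing_x(speed, angle, height, delay) - target_x)
    return total / n_samples


def best_speed(target_x, candidates):
    """Pick the candidate release speed with the lowest expected miss."""
    return min(candidates, key=lambda s: expected_miss(s, target_x))


if __name__ == "__main__":
    speeds = [1.0 + 0.25 * i for i in range(17)]  # 1.0 ... 5.0 m/s
    chosen = best_speed(1.5, speeds)
    print(chosen, expected_miss(chosen, 1.5))
```

Reusing the same seed for every candidate means all speeds are compared against identical noise samples, which keeps the comparison between candidates from being dominated by sampling noise.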

Paper Structure

This paper contains 25 sections, 39 equations, 16 figures, 2 tables, 1 algorithm.

Figures (16)

  • Figure 1: Panda Robot executing the throwing task with target bin.
  • Figure 2: Screenshots from the simulation, with the robot in the release configuration. The bin is placed in an example position. The target position $\boldsymbol{P}$ is the top opening of the cylinder. The RGB axis triplet is the robot's reference frame.
  • Figure 3: The robot performs a throwing motion.
  • Figure 4: Reference joint trajectory and actual trajectory recorded on the robot for the tossing motion. The trajectory moves only three joints. The plot shows the trajectories for the minimum (green) and maximum (red) Cartesian velocity. The vertical dashed lines show the nominal release time. The horizontal dashed lines are the desired joint positions and velocities at release.
  • Figure 5: Results of the Neural Network policy trained on datasets of increasing size. The target positions projected on the horizontal plane are colored green if the target was reached, and red if instead it was missed. The black markers represent the points of the training datasets.
  • ...and 11 more figures

Theorems & Definitions (2)

  • Remark
  • Remark