A Reinforcement Learning Approach to Non-prehensile Manipulation through Sliding

Hamidreza Raei, Elena De Momi, Arash Ajoudani

TL;DR

This paper presents a Deep Deterministic Policy Gradient (DDPG) reinforcement learning (RL) framework for efficient non-prehensile manipulation, specifically for sliding an object on a surface. The trained model exhibits zero-shot sim-to-real transfer.

Abstract

Although robotic applications increasingly demand versatile and dynamic object handling, most existing techniques focus on grasp-based manipulation, which limits their applicability to non-prehensile tasks. To address this need, this study introduces a Deep Deterministic Policy Gradient (DDPG) reinforcement learning framework for efficient non-prehensile manipulation, specifically for sliding an object on a surface. The algorithm generates a linear trajectory by precisely controlling the acceleration of a robotic arm rigidly coupled to a horizontal surface, enabling relative manipulation of an object as it slides on top of that surface. Furthermore, two distinct algorithms have been developed to estimate the frictional forces dynamically during the sliding process. These algorithms provide an online friction estimate after each action, which is fed back into the actor model. This feedback mechanism enhances the policy's adaptability and robustness, ensuring more precise control of the platform's acceleration in response to varying surface conditions. The proposed algorithm is validated through simulations and real-world experiments. Results demonstrate that the proposed framework effectively generalizes sliding manipulation across varying distances and, more importantly, adapts to different surfaces with diverse frictional properties. Notably, the trained model exhibits zero-shot sim-to-real transfer capabilities.
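The abstract does not describe how the two friction estimators work, so the following is only a rough illustration of the underlying idea, not the authors' method. It assumes a one-dimensional platform, a single Coulomb friction coefficient `mu` for both sticking and sliding, and noise-free velocity measurements. The names `step`, `estimate_mu`, and `simulate` are hypothetical. While the object slides, friction is the only horizontal force acting on it, so its world-frame acceleration has magnitude `mu * g`, and a finite difference of its velocity gives an estimate of `mu`.

```python
G = 9.81  # gravitational acceleration in m/s^2


def step(v_platform, v_object, a_platform, mu, dt):
    """Advance platform and object velocities by one explicit Euler step.

    Assumption: one Coulomb coefficient `mu` covers both static and
    kinetic friction (a simplification made for this sketch).
    """
    v_rel = v_object - v_platform
    if abs(v_rel) < 1e-9:
        if abs(a_platform) <= mu * G:
            # Sticking: friction is strong enough to carry the object along.
            a_object = a_platform
        else:
            # Slip starts: friction drags the object in the platform's direction.
            a_object = mu * G if a_platform > 0 else -mu * G
    else:
        # Sliding: kinetic friction opposes the relative velocity.
        a_object = -mu * G if v_rel > 0 else mu * G
    return v_platform + a_platform * dt, v_object + a_object * dt


def estimate_mu(v_object_prev, v_object_next, dt):
    """Estimate mu from the object's world-frame acceleration.

    Exact only while the object is sliding; while it sticks, the value
    is merely a lower bound on the true coefficient.
    """
    return abs(v_object_next - v_object_prev) / (dt * G)


def simulate(a_platform, mu, dt=1e-3, steps=100):
    """Apply a constant platform acceleration and return the final
    platform velocity, object velocity, and latest friction estimate."""
    v_p = v_o = 0.0
    mu_hat = 0.0
    for _ in range(steps):
        v_p, v_o_next = step(v_p, v_o, a_platform, mu, dt)
        mu_hat = estimate_mu(v_o, v_o_next, dt)
        v_o = v_o_next
    return v_p, v_o, mu_hat


if __name__ == "__main__":
    # Platform acceleration above mu * G, so the object slips.
    print(simulate(a_platform=5.0, mu=0.3))
```

In a closed loop of the kind the abstract describes, an estimate like `mu_hat` would be appended to the policy's observation after every action. With a platform acceleration below `mu * G` the object never slips, and the same formula then returns only `a_platform / G`, which is why an estimator of this type needs the object to be sliding.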

Paper Structure

This paper contains 20 sections, 15 equations, 11 figures, 1 algorithm.

Figures (11)

  • Figure 1: Illustration of the non-prehensile manipulation explored in this study: a robotic arm slides an object through a controlled maneuver by commanding a determined Cartesian velocity to the robotic arm controller.
  • Figure 2: A simple schematic of the platform and the sliding object, where $\Sigma_B$ is the coordinate system fixed on the sliding object and $\Sigma_A$ is the coordinate system fixed on the platform.
  • Figure 3: This figure illustrates the states, the random initialization of the environment, and the domain randomization of the friction parameter in the DDPG training framework.
  • Figure 4: A schematic of the evaluation framework. Connections between components for real-world experiments (solid lines) and simulations (dashed lines) are shown. The key difference lies in pose estimation: real-world setups rely on motion capture systems and inverse kinematics to determine the pose of the end-effector and the object, while simulations provide these parameters directly.
  • Figure 5: These plots illustrate the performance of the actor network in sliding displacement under friction coefficient errors, compared to the analytically calculated optimal solution. Yellow highlights show the acceptable error margin, while gray highlights denote complete failure of the action to move the object.
  • ...and 6 more figures