On the Feasibility of A Mixed-Method Approach for Solving Long Horizon Task-Oriented Dexterous Manipulation

Shaunak A. Mehta, Rana Soltani Zarrin

TL;DR

This paper investigates a mixed-method approach to the long-horizon task of tool usage and shows that the method chosen for each subtask outperforms the commonly adopted reinforcement learning approach, both on the individual subtasks and on the full long-horizon task in simulation.

Abstract

In-hand manipulation of tools using dexterous hands in the real world is an underexplored problem in the literature. Tools have more complex geometry and are larger than commonly used objects such as cubes or cylinders, and task-oriented in-hand tool manipulation also requires many sub-tasks to be performed sequentially. These may include reaching for the tool, picking it up, reorienting it in hand (with or without regrasping) to reach a final grasp appropriate for tool usage, and carrying the tool to the desired pose. Research on long-horizon manipulation using dexterous hands is rather limited, and existing work focuses on learning the individual sub-tasks with a single method such as reinforcement learning (RL) and then combining the resulting policies to perform a long-horizon task. However, a single method may not be the best choice for every sub-task, and this can be more pronounced when multi-fingered hands manipulate objects with complex geometry such as tools. In this paper, we investigate a mixed-method approach to the long-horizon task of tool usage that uses imitation learning, reinforcement learning, and model-based control. We also discuss a new RL-based teacher-student framework that incorporates real-world data into offline training. We show that the method we propose for each sub-task outperforms the commonly adopted reinforcement learning approach, both across the individual sub-tasks and on the full long-horizon task in simulation. Finally, we show successful transfer to the real world.

Paper Structure

This paper contains 20 sections, 4 equations, 7 figures.

Figures (7)

  • Figure 1: The execution of a long-horizon dexterous manipulation task. The robot arm first reaches for the tool using a model-based policy $\pi_{MB}$, then grasps it using a policy trained with imitation learning $\pi_{IL}$. The robot then performs in-hand manipulation using a reinforcement learning policy $\pi_{RL}$ and finally uses $\pi_{MB}$ again to carry the tool to the desired location.
  • Figure 2: Choosing a suitable method for solving a subtask. The flowchart highlights the different robot control strategies that can be used to solve a given subtask in a long-horizon dexterous manipulation task. It further provides the conditions for employing each method.
  • Figure 3: Imitation Learning Framework. In this approach, Gaussian noise is added to the human demonstrations to augment the data, and an ensemble of $N$ networks is trained. Each network is trained independently to predict a sequence of $n$ future actions from a given state $s$.
  • Figure 4: Teacher-Student Framework Incorporating Real World Demonstrations. The teacher model is trained with privileged information that may not be easily available in the real world. This teacher model is then used to supervise the training of the student model, which is pretrained on demonstrations provided by an expert in the real world. Finally, the student model is deployed in the real world and fine-tuned without supervision to adapt to the hardware control and physical parameters.
  • Figure 5: Our proposed unified framework for combining different approaches to solve a long-horizon task. Here, the high-level policy selects which low-level policy to use based on the current state of the environment, and the selected low-level policy takes actions in the environment.
  • ...and 2 more figures
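
The control flow described in the Figure 1 and Figure 5 captions can be illustrated with a minimal Python sketch: a high-level policy inspects the current state and picks one of the low-level policies, which then acts. Everything here is an assumption made for illustration and is not taken from the paper: the `State` fields, the hand-written selection rule standing in for the high-level policy, the string-valued actions, and the `toy_step` transition function. The actual policies $\pi_{MB}$, $\pi_{IL}$, and $\pi_{RL}$ would be a model-based controller and learned networks rather than these stubs.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List


@dataclass(frozen=True)
class State:
    """Hypothetical, heavily simplified task state (not the paper's state space)."""
    hand_at_tool: bool = False
    tool_grasped: bool = False
    grasp_is_final: bool = False
    tool_at_goal: bool = False


# Stub low-level policies; each returns a symbolic action name.
def pi_mb(state: State) -> str:
    # Model-based control: reach before grasping, carry afterwards.
    return "carry" if state.tool_grasped else "reach"


def pi_il(state: State) -> str:
    # Imitation-learned grasping policy (stub).
    return "grasp"


def pi_rl(state: State) -> str:
    # RL in-hand reorientation policy (stub).
    return "reorient"


LOW_LEVEL: Dict[str, Callable[[State], str]] = {"MB": pi_mb, "IL": pi_il, "RL": pi_rl}


def high_level_policy(state: State) -> str:
    """Assumed rule-based selector: choose a low-level policy from the state."""
    if not state.hand_at_tool:
        return "MB"
    if not state.tool_grasped:
        return "IL"
    if not state.grasp_is_final:
        return "RL"
    return "MB"


def toy_step(state: State, action: str) -> State:
    """Toy deterministic transition: every action succeeds immediately."""
    effects = {
        "reach": {"hand_at_tool": True},
        "grasp": {"tool_grasped": True},
        "reorient": {"grasp_is_final": True},
        "carry": {"tool_at_goal": True},
    }
    return replace(state, **effects[action])


def run_episode(max_steps: int = 10) -> List[str]:
    """Run the two-level loop and return the sequence of selected policies."""
    state, trace = State(), []
    for _ in range(max_steps):
        if state.tool_at_goal:
            break
        name = high_level_policy(state)
        trace.append(name)
        state = toy_step(state, LOW_LEVEL[name](state))
    return trace


print(run_episode())
```

In this toy setting the selected policies follow the order given in the Figure 1 caption: model-based reaching, imitation-learned grasping, RL reorientation, then model-based carrying. Keeping the selector separate from the low-level policies means any one sub-task's method can be swapped without touching the others, which is the property the mixed-method framing relies on.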