ALPHA-$α$ and Bi-ACT Are All You Need: Importance of Position and Force Information/Control for Imitation Learning of Unimanual and Bimanual Robotic Manipulation with Low-Cost System

Masato Kobayashi, Thanpimon Buamanee, Takumi Kobayashi

TL;DR

Bi-ACT combines bilateral control with Action Chunking Transformers to enable force-aware imitation learning for unimanual and bimanual robotic manipulation on a low-cost platform (ALPHA-$α$). By collecting joint angles, velocities, torques, and images from leader/follower pairs at high frequencies ($1000$ Hz for joints and ~$100$ Hz for images) and using a CVAE-based architecture, Bi-ACT achieves robust multi-step predictions that account for object hardness, shape, and weight distribution. Unimanual experiments show Bi-ACT with force control achieving near-perfect success across trained and several untrained objects, while ablations without force reveal limitations on larger or deformable items; bimanual experiments confirm high success rates across Put-Cup-Ball, Egg Handling, and Open Cap tasks. The work demonstrates that integrating position and force information on a low-cost bilateral platform substantially enhances manipulation capabilities and broadens access to bimanual robotics research.
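As a rough illustration of what the mixed sampling rates imply when assembling training samples, the sketch below pairs each $1000$ Hz joint sample with the most recent ~$100$ Hz image frame. This is not the authors' pipeline: the function name `align_frames_to_joints`, the integer-millisecond timestamps, and the "latest frame at or before" pairing rule are all assumptions made for illustration.

```python
import numpy as np

def align_frames_to_joints(joint_t, frame_t):
    """For each joint timestamp, return the index of the latest frame at or before it.

    Hypothetical helper: both inputs are sorted 1-D arrays of timestamps.
    """
    idx = np.searchsorted(frame_t, joint_t, side="right") - 1
    # Joint samples that precede the first frame fall back to frame 0.
    return np.clip(idx, 0, len(frame_t) - 1)

# One second of made-up timestamps, in integer milliseconds to avoid float rounding.
joint_t = np.arange(0, 1000, 1)    # 1000 Hz joint samples (angle, velocity, torque)
frame_t = np.arange(0, 1000, 10)   # 100 Hz image frames
frame_idx = align_frames_to_joints(joint_t, frame_t)
```

Under these assumptions, ten consecutive joint samples share each image frame; a real recorder would also have to handle clock jitter and dropped frames.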

Abstract

Autonomous manipulation in everyday tasks requires flexible action generation to handle complex, diverse real-world environments, such as objects with varying hardness and softness. Imitation Learning (IL) enables robots to learn complex tasks from expert demonstrations. However, many existing methods rely on position/unilateral control, leaving challenges in tasks that require force information/control, such as carefully grasping fragile objects or objects of varying hardness. As the need for diverse controls increases, there is demand for low-cost bimanual robots that consider various motor inputs. To address these challenges, we introduce Bilateral Control-Based Imitation Learning via Action Chunking with Transformers (Bi-ACT) and "A" "L"ow-cost "P"hysical "Ha"rdware Considering Diverse Motor Control Modes for Research in Everyday Bimanual Robotic Manipulation (ALPHA-$α$). Bi-ACT leverages bilateral control to utilize both position and force information, enhancing the robot's adaptability to object characteristics such as hardness, shape, and weight. ALPHA-$α$ is designed around affordability, ease of use, repairability, ease of assembly, and diverse control modes (position, velocity, torque), allowing researchers and developers to freely build control systems on it. In our experiments, we conducted a detailed analysis of Bi-ACT in unimanual manipulation tasks, confirming its superior performance and adaptability compared to Bi-ACT without force control. Based on these results, we applied Bi-ACT to bimanual manipulation tasks. Experimental results demonstrated high success rates in coordinated bimanual operations across multiple tasks. The effectiveness of Bi-ACT and ALPHA-$α$ is demonstrated through comprehensive real-world experiments. Video available at: https://mertcookimg.github.io/alpha-biact/

Paper Structure

This paper contains 37 sections, 5 equations, 15 figures, 5 tables.

Figures (15)

  • Figure 1: Overview of Bilateral Control-Based Imitation Learning System using ALPHA-$\alpha$ and Bi-ACT
  • Figure 2: Image of Unilateral Control-Based Imitation Learning
  • Figure 3: Image of 4ch Bilateral Control-Based Imitation Learning
  • Figure 4: ALPHA-$\alpha$: "A" "L"ow-cost "P"hysical "Ha"rdware Considering Diverse Motor Control Modes for Research in Everyday Bimanual Robotic Manipulation
  • Figure 5: Image Diagram of Data Collection
  • ...and 10 more figures