ALPHA-$α$ and Bi-ACT Are All You Need: Importance of Position and Force Information/Control for Imitation Learning of Unimanual and Bimanual Robotic Manipulation with Low-Cost System
Masato Kobayashi, Thanpimon Buamanee, Takumi Kobayashi
TL;DR
Bi-ACT combines bilateral control with Action Chunking Transformers to enable force-aware imitation learning for unimanual and bimanual robotic manipulation on a low-cost platform (ALPHA-$α$). By collecting joint angles, velocities, torques, and images from leader/follower pairs at high frequencies ($1000$ Hz for joints and ~$100$ Hz for images) and using a CVAE-based architecture, Bi-ACT achieves robust multi-step predictions that account for object hardness, shape, and weight distribution. Unimanual experiments show Bi-ACT with force control achieving near-perfect success across trained and several untrained objects, while ablations without force reveal limitations on larger or deformable items; bimanual experiments confirm high success rates across Put-Cup-Ball, Egg Handling, and Open Cap tasks. The work demonstrates that integrating position and force information on a low-cost bilateral platform substantially enhances manipulation capabilities and broadens access to bimanual robotics research.
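To make the sampling rates above concrete, the following is a minimal sketch of how $1000$ Hz joint logs and roughly $100$ Hz image frames might be aligned into (observation, action-chunk) training pairs. It is an illustration, not the authors' pipeline: the function name `make_chunks`, the `State` layout, the choice of follower state as observation and leader state as prediction target, and the fixed integer rate ratio are all assumptions made here.

```python
from typing import List, Sequence, Tuple

JOINT_HZ = 1000  # joint sampling rate stated in the text
IMAGE_HZ = 100   # approximate image rate stated in the text

# Hypothetical per-step state: (angle, velocity, torque) for a single joint.
State = Tuple[float, float, float]
Sample = Tuple[int, State, List[State]]


def make_chunks(
    leader: Sequence[State],
    follower: Sequence[State],
    n_frames: int,
    chunk_size: int,
    joint_hz: int = JOINT_HZ,
    image_hz: int = IMAGE_HZ,
) -> List[Sample]:
    """Pair each image frame with a follower state and a chunk of leader states.

    Assumption: the follower state at the frame's timestamp is the observation,
    and the next `chunk_size` leader states form the multi-step target.
    """
    step = joint_hz // image_hz  # joint samples per image frame (10 here)
    limit = min(len(leader), len(follower))
    samples: List[Sample] = []
    for frame in range(n_frames):
        t = frame * step  # joint index aligned with this image frame
        if t + chunk_size > limit:
            break  # not enough future joint samples to fill a chunk
        samples.append((frame, follower[t], list(leader[t:t + chunk_size])))
    return samples


# Synthetic data for illustration only: 0.1 s of joint logs, 10 image frames.
leader_log = [(float(i), 0.0, 0.0) for i in range(100)]
follower_log = [(float(i) - 0.5, 0.0, 0.1) for i in range(100)]
pairs = make_chunks(leader_log, follower_log, n_frames=10, chunk_size=20)
```

Each resulting pair carries torque alongside angle and velocity, which is the information a position-only pipeline would omit.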
Abstract
Autonomous manipulation in everyday tasks requires flexible action generation to handle complex, diverse real-world environments, such as objects of varying hardness and softness. Imitation Learning (IL) enables robots to learn complex tasks from expert demonstrations. However, many existing methods rely on position/unilateral control, leaving open challenges in tasks that require force information/control, such as carefully grasping fragile objects or objects of varying hardness. As the need for diverse control modes increases, there is demand for low-cost bimanual robots that support various motor inputs. To address these challenges, we introduce Bilateral Control-Based Imitation Learning via Action Chunking with Transformers (Bi-ACT) and "A" "L"ow-cost "P"hysical "Ha"rdware Considering Diverse Motor Control Modes for Research in Everyday Bimanual Robotic Manipulation (ALPHA-$α$). Bi-ACT leverages bilateral control to utilize both position and force information, enhancing the robot's adaptability to object characteristics such as hardness, shape, and weight. ALPHA-$α$ is designed around affordability, ease of use, repairability, ease of assembly, and diverse control modes (position, velocity, torque), allowing researchers and developers to freely build control systems on top of it. In our experiments, we conducted a detailed analysis of Bi-ACT in unimanual manipulation tasks, confirming its superior performance and adaptability compared to Bi-ACT without force control. Based on these results, we applied Bi-ACT to bimanual manipulation tasks, where it achieved high success rates in coordinated bimanual operations across multiple tasks. Comprehensive real-world experiments demonstrate the effectiveness of Bi-ACT and ALPHA-$α$. Video available at: https://mertcookimg.github.io/alpha-biact/
