CycleManip: Enabling Cyclic Task Manipulation via Effective Historical Perception and Understanding

Yi-Lin Wei; Haoran Liao; Yuhao Lin; Pengyue Wang; Zhizhao Liang; Guiliang Liu; Wei-Shi Zheng

TL;DR

CycleManip addresses the challenge of cyclic manipulation by enhancing historical perception and understanding to reliably execute tasks for a specified number of cycles. It combines a cost-aware history sampling strategy with a multi-task objective for cycle-progress prediction, enabling an end-to-end imitation policy without extra modules. A benchmark built on RoboTwin 2.0 with automated cycle evaluation supports rigorous evaluation, and results show superior performance across simulation and diverse real-world robotic platforms, with plug-and-play compatibility for Vision-Language-Action models. The work offers a practical path toward autonomous, adaptable cyclic manipulation in real-world settings.

Abstract

In this paper, we explore an important yet underexplored task in robot manipulation: cycle-based manipulation, where robots must perform cyclic or repetitive actions and stop at an expected terminal time. Such tasks are common in daily life, for example shaking a bottle or hammering a nail. Because few prior works have studied this setting, two main challenges remain: 1) imitation methods often fail to complete these tasks within the expected terminal time because they use history ineffectively; 2) the absence of a benchmark with sufficient data and automatic evaluation tools hinders the development of effective solutions. To address these challenges, we first propose the CycleManip framework, which achieves cycle-based task manipulation in an end-to-end imitation manner without requiring any extra models, hierarchical structure, or significant computational overhead. The core insight is to enhance history perception with a cost-aware sampling strategy and to improve history understanding with multi-task learning. Second, we introduce a cycle-based task manipulation benchmark that provides diverse cycle-based tasks and an automatic evaluation method. Extensive experiments in both simulation and real-world settings demonstrate that our method achieves high success rates in cycle-based task manipulation. The results further show strong adaptability to general manipulation and plug-and-play compatibility with imitation policies such as Vision-Language-Action (VLA) models. Moreover, our approach can be applied across diverse robotic platforms, including bi-arm grippers, dexterous hands, and humanoid robots.
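
To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify how the cost-aware sampling or the cycle-progress objective is defined, so everything here is an assumption: the function names `sample_history_indices` and `cycle_progress_target`, the exponential spacing rule, and the definition of progress as a ratio of completed to target cycles are all hypothetical. The sketch only shows one plausible way to cover a long history under a fixed frame budget and to derive an auxiliary progress label.

```python
from typing import List


def sample_history_indices(t: int, budget: int) -> List[int]:
    """Pick at most `budget` timesteps from 0..t (hypothetical rule).

    Recent steps are sampled densely and older steps sparsely by doubling
    the look-back offset, so a long history fits a fixed frame budget.
    """
    if t < 0 or budget <= 0:
        return []
    indices = {t}  # always keep the current step
    offset = 1
    while len(indices) < budget and t - offset >= 0:
        indices.add(t - offset)
        offset *= 2  # look twice as far back each time
    return sorted(indices)


def cycle_progress_target(completed_cycles: int, target_cycles: int) -> float:
    """Hypothetical auxiliary label: fraction of requested cycles done, capped at 1."""
    if target_cycles <= 0:
        raise ValueError("target_cycles must be positive")
    return min(completed_cycles / target_cycles, 1.0)
```

With a budget of 5 frames at step 100, this rule keeps steps 92, 96, 98, 99, and 100, so the cost of processing history stays constant as the episode grows. A policy trained with an auxiliary head could regress the progress label alongside its action output; how CycleManip actually weights or structures that objective is not stated in the abstract.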
