RoboCAS: A Benchmark for Robotic Manipulation in Complex Object Arrangement Scenarios
Liming Zheng, Feng Yan, Fanfan Liu, Chengjian Feng, Zhuoliang Kang, Lin Ma
TL;DR
RoboCAS introduces the first benchmark focused on long-horizon robotic manipulation in complex object arrangements, addressing clutter, occlusion, and inter-object interference under language instructions. Built on a realistic SAPIEN-based simulation with scanned real-world objects, it supports automated, scripted data generation across scattered, orderly, and stacked layouts for picking, selecting, and searching tasks. Experiments with RT-1 and RoboFlamingo show meaningful success in simple layouts but substantial gaps in stacked, cluttered scenarios, underscoring the need for better spatial reasoning and an understanding of chain reactions among objects. The benchmark offers a cost-effective platform for advancing embodied AI toward robust real-world manipulation under ambiguous language and in complex environments.
Abstract
Foundation models hold significant potential for enabling robots to perform long-horizon, general-purpose manipulation tasks. However, the simplicity of tasks and the uniformity of environments in existing benchmarks restrict the effective deployment of these models in complex scenarios. To address this limitation, this paper introduces RoboCAS, the first benchmark specifically designed for complex object arrangement scenarios in robotic manipulation. The benchmark uses flexible and concise scripted policies to efficiently collect a diverse set of demonstrations covering scattered, orderly, and stacked object arrangements within a highly realistic physical simulation environment. It includes complex processes such as target retrieval, obstacle clearance, and robot manipulation, testing an agent's ability to plan over long horizons, reason spatially, and predict chain reactions under ambiguous instructions. Extensive experiments on multiple baseline models reveal their limitations in handling complex object arrangement scenarios. These results underscore the urgent need for intelligent agents capable of long-horizon operation in practical deployments and offer insights for future research. Project website: https://github.com/notFoundThisPerson/RoboCAS-v0
