CoinRobot: Generalized End-to-end Robotic Learning for Physical Intelligence
Yu Zhao, Huxian Liu, Xiang Chen, Jiankai Sun, Jiahuan Yan, Luhui Hu
TL;DR
CoinRobot addresses the challenge of generalizing end-to-end robotic learning across heterogeneous hardware and tasks. It combines a diffusion-based action policy with a modular perception and control stack, enabling cross-platform deployment with minimal task-specific customization. On seven real-world manipulation tasks, diffusion policies trained in the framework outperform a LeRobot baseline, and the work also shows multi-task and cross-view generalization. The framework and its open-source datasets and models aim to democratize robust embodied intelligence across diverse robotic systems.
Abstract
Physical intelligence holds immense promise for advancing embodied intelligence, enabling robots to acquire complex behaviors from demonstrations. However, achieving generalization and transfer across diverse robotic platforms and environments requires careful design of model architectures, training strategies, and data diversity. Meanwhile, existing systems often struggle with scalability, adaptability to heterogeneous hardware, and objective evaluation in real-world settings. We present a generalized end-to-end robotic learning framework designed to bridge this gap. Our framework introduces a unified architecture that supports cross-platform adaptability, enabling seamless deployment across industrial-grade robots, collaborative arms, and novel embodiments without task-specific modifications. By integrating multi-task learning with streamlined network designs, it achieves more robust performance than conventional approaches while maintaining compatibility with varying sensor configurations and action spaces. We validate our framework through extensive experiments on seven manipulation tasks. Notably, diffusion-based models trained in our framework demonstrate superior performance and generalizability compared with the LeRobot framework, with improvements across diverse robotic platforms and environmental conditions.
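To make the term "diffusion-based model" concrete, the following is a minimal sketch of how a diffusion policy can turn Gaussian noise into an action sequence through iterative denoising. It assumes a standard DDPM-style sampler with a linear noise schedule, which the abstract does not specify. All names (`make_schedule`, `toy_noise_predictor`, `sample_actions`), the horizon of 8 steps, and the 7-dimensional action are hypothetical choices for illustration and are not taken from CoinRobot or LeRobot.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule plus cumulative products of alpha = 1 - beta."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def toy_noise_predictor(noisy_actions, step, observation):
    """Hypothetical placeholder for a trained network: always predicts zero noise."""
    return [[0.0] * len(row) for row in noisy_actions]


def sample_actions(observation, horizon=8, action_dim=7, num_steps=50,
                   noise_predictor=toy_noise_predictor, seed=0):
    """Run the DDPM reverse process to produce a (horizon x action_dim) action chunk."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    # Start from pure Gaussian noise over the whole action sequence.
    actions = [[rng.gauss(0.0, 1.0) for _ in range(action_dim)]
               for _ in range(horizon)]
    for t in reversed(range(num_steps)):
        # The predictor is conditioned on the observation (e.g. camera features).
        eps = noise_predictor(actions, t, observation)
        eps_scale = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        inv_sqrt_alpha = 1.0 / math.sqrt(1.0 - betas[t])
        # No noise is added on the final step (t == 0).
        sigma = math.sqrt(betas[t]) if t > 0 else 0.0
        actions = [
            [inv_sqrt_alpha * (a - eps_scale * e) + sigma * rng.gauss(0.0, 1.0)
             for a, e in zip(row, eps_row)]
            for row, eps_row in zip(actions, eps)
        ]
    return actions


if __name__ == "__main__":
    chunk = sample_actions(observation=None)
    print(len(chunk), len(chunk[0]))
```

In a real system the placeholder predictor would be replaced by a trained network conditioned on sensor observations, and the sampled action chunk would be passed to the robot controller. The sketch only illustrates the sampling loop, not training or any platform-specific interface.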
