
RoboPocket: Improve Robot Policies Instantly with Your Phone

Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu

TL;DR

RoboPocket is a portable system that enables Robot-Free Instant Policy Iteration using a single consumer smartphone. Its Remote Inference framework visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight, and its asynchronous Online Finetuning pipeline continuously updates the policy with incoming data, closing the learning loop in minutes.

Abstract

Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy's weaknesses, leading to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using a single consumer smartphone. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, effectively closing the learning loop in minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles the data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop also boosts sample efficiency by up to 2$\times$ in distributed environments with a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.

Paper Structure (49 sections, 1 equation, 14 figures, 2 tables)

Figures (14)

  • Figure 1: Unlike previous workflows (left) that rely on prolonged offline feedback loops with physical robots, RoboPocket (right) enables instant policy updates in distributed environments using a consumer smartphone. By visualizing the policy's intent via AR Visual Foresight, users can proactively identify weaknesses and provide corrective data that refines the policy in minutes.
  • Figure 2: RoboPocket System Design. (a) Hardware Design: The system features a low-cost, 3D-printed adaptive gripper that is isomorphic to the Robotiq 2F-85 to ensure physical consistency. A custom mount with a fisheye lens expands the iPhone's visual context, while an ESP32-based interface board captures gripper width via a magnetic encoder. (b) Software Interface: The iOS application acts as an edge-computing hub, providing real-time AR feedback on data quality and kinematic validity, visualizing the simulated robot, and enabling spatiotemporal synchronization for multi-device setups.
  • Figure 3: Overview of Robot-Free Instant Policy Iteration. (Left) Using AR Visual Foresight, the user proactively identifies policy weaknesses (OOD states) and likely failures in the real world and collects corrective data, which improves the policy instantly. (Right) Collected corrective data is immediately streamed to the Data Serving Node. The Training Server performs online finetuning using weighted sampling (RLPD) and pushes updated weights to the Inference Server. The improved policy predictions are streamed back to the iPhone in real time ($<150$ ms), enabling continuous, robot-free policy improvement in minutes.
  • Figure 4: Evaluation Tasks. (a) We evaluate our method on four manipulation tasks: Block Sorting, Seasoning Pouring, Towel Folding, and Snack Bagging. (b) The Mouse Arrangement task is used to validate data validity and data scaling laws. (c) To assess distributed generalization, 4 users collect data and perform in-the-wild policy iteration across distinct environments for the Block Sorting task.
  • Figure 5: RoboPocket Localization Accuracy Evaluation. We measure the cumulative 3D Euclidean error of the trajectories against robot kinematic ground truth.
  • ...and 9 more figures
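
The Figure 3 caption mentions online finetuning with weighted sampling (RLPD). As a rough illustration only, the sketch below mixes an offline demonstration set with newly streamed corrective data when forming each training batch. The function name `sample_mixed_batch`, the `online_fraction` parameter, and its default of 0.5 are assumptions made for this example: 0.5 mirrors the symmetric sampling described in RLPD, while the ratio RoboPocket actually uses is not stated in this section.

```python
import random


def sample_mixed_batch(offline, online, batch_size, online_fraction=0.5, rng=None):
    """Draw one training batch that mixes offline and online samples.

    Hypothetical helper: online_fraction=0.5 follows RLPD-style symmetric
    sampling and is an assumption, not a value taken from the paper.
    """
    rng = rng or random.Random()
    if not offline:
        raise ValueError("offline dataset must not be empty")
    # Until corrective data has arrived, train on offline data only.
    n_online = round(batch_size * online_fraction) if online else 0
    n_offline = batch_size - n_online
    # Sample with replacement so a small online buffer can still fill its share.
    batch = rng.choices(online, k=n_online) if n_online else []
    batch += rng.choices(offline, k=n_offline)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    offline = [("offline", i) for i in range(100)]
    online = [("online", i) for i in range(5)]
    batch = sample_mixed_batch(offline, online, batch_size=8, rng=random.Random(0))
    print(sum(1 for source, _ in batch if source == "online"), "of", len(batch), "samples are online")
```

Sampling with replacement keeps the share of corrective data fixed even when only a handful of corrections exist, which is one plausible reason to weight the two sources rather than concatenate them.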