PhaForce: Phase-Scheduled Visual-Force Policy Learning with Slow Planning and Fast Correction for Contact-Rich Manipulation

Mingxin Wang, Zhirun Yue, Renhao Lu, Yizhe Li, Zihan Wang, Guoping Pan, Kangkang Dong, Jun Cheng, Yi Cheng, Houde Liu

TL;DR

PhaForce is a phase-scheduled visual-force policy that coordinates low-rate chunk-level planning and high-rate residual correction via a unified contact/phase schedule. Across multiple real-robot contact-rich tasks, it achieves an average success rate of 86%, substantially improves contact quality by regulating interaction forces, and adapts robustly to OOD geometric shifts.

Abstract

Contact-rich manipulation requires not only vision-dominant task semantics but also closed-loop reactions to force/torque (F/T) transients. Yet, generative visuomotor policies are typically constrained to low-frequency updates due to inference latency and action chunking, underutilizing F/T for control-rate feedback. Furthermore, existing force-aware methods often inject force continuously and indiscriminately, lacking an explicit mechanism to schedule when / how much / where to apply force across different task phases. We propose PhaForce, a phase-scheduled visual-force policy that coordinates low-rate chunk-level planning and high-rate residual correction via a unified contact/phase schedule. PhaForce comprises (i) a contact-aware phase predictor (CAP) that estimates contact probability and phase belief, (ii) a Slow diffusion planner that performs dual-gated visual-force fusion with orthogonal residual injection to preserve vision semantics while conditioning on force, and (iii) a Fast corrector that applies control-rate phase-routed residuals in interpretable corrective subspaces for within-chunk micro-adjustments. Across multiple real-robot contact-rich tasks, PhaForce achieves an average success rate of 86% (+40 pp over baselines), while also substantially improving contact quality by regulating interaction forces and exhibiting robust adaptability to OOD geometric shifts.

Paper Structure (17 sections, 14 equations, 5 figures, 5 tables)

Figures (5)

  • Figure 1: Comparison of three force-aware policy architectures. Prior works fuse vision and force into a single generative policy, while RDP adopts a slow-fast decomposition without explicit phase scheduling. PhaForce introduces an explicit contact/phase schedule to coordinate force usage for both chunk-level planning (Slow) and within-chunk correction (Fast).
  • Figure 2: PhaForce Architecture. The Slow diffusion planner runs at $f_s = 6$ Hz to generate action chunks, while CAP and the Fast corrector run at the control rate $f_c = 24$ Hz for contact/phase prediction and within-chunk closed-loop correction. In Slow, dual-gated vision-force fusion with orthogonal residual injection preserves vision-dominant task semantics. In Fast, phase-belief soft routing activates corrective subspaces and outputs a residual correction that is composed with the Slow base action to obtain the executed command.
  • Figure 3: We design five contact-rich tasks; each task exhibits varying contact states and phase belief, and each phase activates different corrective subspaces. Each row illustrates the phase transitions in a task. PhaForce not only excels on in-distribution tasks but also remains stable under OOD shifts.
  • Figure 4: Three common baseline failure modes in plug-in tasks.
  • Figure 5: We visualize the contact probability, phase belief, and $z$-axis force $F_z$ in a USB Plug-in task (curves are smoothed for visualization).
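
The two-rate structure described in the Figure 2 caption can be illustrated with a minimal Python sketch. Only the rates (6 Hz Slow, 24 Hz control) come from the caption. The function names, the additive composition of base action and residual, the weighting of per-phase residuals by phase belief, and the scaling by contact probability are all illustrative assumptions, not the paper's actual formulation.

```python
from typing import List, Sequence

SLOW_HZ = 6    # Slow planner rate (Figure 2 caption)
FAST_HZ = 24   # control rate of CAP and the Fast corrector (Figure 2 caption)
TICKS_PER_SLOW_UPDATE = FAST_HZ // SLOW_HZ  # control ticks between Slow updates


def is_slow_tick(tick: int) -> bool:
    """True on control ticks where the Slow planner would produce a new chunk."""
    return tick % TICKS_PER_SLOW_UPDATE == 0


def soft_routed_residual(
    phase_belief: Sequence[float],
    phase_residuals: Sequence[Sequence[float]],
    contact_prob: float,
) -> List[float]:
    """Blend hypothetical per-phase residuals by phase belief.

    Assumption: each phase contributes one residual vector, and the blend is
    scaled by the contact probability so it vanishes in free space.
    """
    dim = len(phase_residuals[0])
    blended = [0.0] * dim
    for weight, residual in zip(phase_belief, phase_residuals):
        for i in range(dim):
            blended[i] += weight * residual[i]
    return [contact_prob * value for value in blended]


def compose(base_action: Sequence[float], residual: Sequence[float]) -> List[float]:
    """Additive composition (assumed; the caption only says 'composed')."""
    return [b + r for b, r in zip(base_action, residual)]


# Example: two phases, a 3-D translational action, moderate contact confidence.
belief = [0.25, 0.75]
residuals = [[0.0, 0.0, -0.004], [0.002, 0.0, 0.0]]
residual = soft_routed_residual(belief, residuals, contact_prob=0.5)
executed = compose([0.1, 0.2, 0.3], residual)
slow_ticks = [t for t in range(8) if is_slow_tick(t)]
```

With these rates, the Fast corrector gets four control ticks per Slow update, which is where within-chunk correction can act on F/T transients that the planner would only see at its next update.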