A Reactive Framework for Whole-Body Motion Planning of Mobile Manipulators Combining Reinforcement Learning and SDF-Constrained Quadratic Programming

Chenyu Zhang; Shiying Sun; Kuan Liu; Chuanbao Zhou; Xiaoguang Zhao; Min Tan; Yanlong Huang

TL;DR

The paper addresses efficient, safe whole-body motion planning for mobile manipulators with redundant DOFs in cluttered environments. It presents a hybrid framework with two layers. Bayes-DSAC, a distributional off-policy RL method with Bayesian controller fusion, plans task-space velocities; a robot-centric SDF-constrained QP then handles joint-space control, translating those high-level commands into collision-free joint velocities. The framework defines a soft return distribution $Z_{\pi}(s_t,a_t)$ and a Bayesian-fused hybrid distribution $Z^{hyb}$ to improve value estimation and convergence, and it enforces obstacle avoidance through $A_{avoi}$-based joint-space constraints and SDF queries. Experiments across multiple cluttered scenarios show faster learning, higher planning efficiency, and improved safety compared with several strong baselines. The work integrates perception, learning, and optimization into a single real-time-capable pipeline for reactive whole-body planning, with applications in autonomous service and industrial robotics.
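To make the idea of an SDF-constrained QP concrete, the following is a minimal sketch and not the paper's formulation. It assumes a point robot (identity Jacobian) instead of a whole-body model, a single spherical obstacle with an analytic SDF, and a velocity-damper style inequality; the paper's actual $A_{avoi}$ constraint may be built differently. With only one linear constraint, the QP $\min \|v - v_{des}\|^2$ subject to $a^\top v \le b$ has a closed-form solution (projection onto a half-space), so no solver is needed. All function names and parameters (`sphere_sdf`, `safe_velocity`, `d_safe`, `d_infl`, `gain`) are hypothetical.

```python
import math


def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere, and the unit gradient of that distance."""
    diff = [pi - ci for pi, ci in zip(p, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("gradient is undefined at the sphere center")
    return norm - radius, [d / norm for d in diff]


def safe_velocity(v_des, p, center, radius, d_safe=0.05, d_infl=0.5, gain=1.0):
    """Solve min ||v - v_des||^2 s.t. a.v <= b for one SDF-based constraint.

    Assumed (hypothetical) constraint: the approach speed toward the obstacle,
    -grad.v, must not exceed gain * (dist - d_safe) / (d_infl - d_safe).
    """
    dist, grad = sphere_sdf(p, center, radius)
    if dist >= d_infl:
        # Outside the influence distance: no constraint is active.
        return list(v_des)
    a = [-g for g in grad]  # constraint normal: points toward the obstacle
    b = gain * (dist - d_safe) / (d_infl - d_safe)
    violation = sum(ai * vi for ai, vi in zip(a, v_des)) - b
    if violation <= 0.0:
        # The desired velocity already satisfies the constraint.
        return list(v_des)
    # Project v_des onto the half-space boundary a.v = b.
    a_sq = sum(ai * ai for ai in a)
    return [vi - violation * ai / a_sq for vi, ai in zip(v_des, a)]


# Usage: a point 0.2 m from the obstacle surface, commanded to move toward it.
v = safe_velocity([1.0, 0.5, 0.0], [0.3, 0.0, 0.0], [1.0, 0.0, 0.0], 0.5)
```

In this sketch only the velocity component along the SDF gradient is reduced; the tangential component passes through unchanged. A whole-body version would stack one such row per robot link, map it into joint space through the link Jacobians, and hand the resulting constraint matrix to a general QP solver.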

Abstract

As an important branch of embodied artificial intelligence, mobile manipulators are increasingly applied in intelligent services, but their redundant degrees of freedom make efficient motion planning in cluttered environments difficult. To address this issue, this paper proposes a hybrid learning and optimization framework for reactive whole-body motion planning of mobile manipulators. We develop the Bayesian distributional soft actor-critic (Bayes-DSAC) algorithm to improve the quality of value estimation and the convergence of learning. Additionally, we introduce a quadratic programming method constrained by the signed distance field (SDF) to enhance the safety of obstacle-avoidance motion. We conduct experiments and compare the framework against a standard benchmark. The results verify that the proposed framework significantly improves the efficiency of reactive whole-body motion planning, reducing planning time and raising the planning success rate. The proposed reinforcement learning method also learns rapidly in the whole-body planning task. The framework allows mobile manipulators to adapt to complex environments more safely and efficiently.
