PlaMo: Plan and Move in Rich 3D Physical Environments

Assaf Hallak, Gal Dalal, Chen Tessler, Kelly Guo, Shie Mannor, Gal Chechik

TL;DR

PlaMo addresses long-horizon, physics-consistent humanoid navigation in rich 3D environments by coupling a three-stage, scene-aware path planner with a reinforcement learning-based locomotion controller that adapts to terrain, obstacles, and dynamic objects. The high-level planner uses an $A^*$-based, slope-aware approach, followed by a path refiner and a QP-based speed controller to generate executable trajectories that respect the character's motion envelope. The low-level locomotion controller is trained with RL using AMP-based motion stylization and path-following rewards to produce natural, varied motions across four locomotion types. Experiments in IsaacGym demonstrate robust performance on unseen scenes, dynamic obstacles, and terrain variations, highlighting PlaMo's potential for NPCs and automated content in gaming and simulation contexts.

Abstract

Controlling humanoids in complex physically simulated worlds is a long-standing challenge with numerous applications in gaming, simulation, and visual content creation. In our setup, given a rich and complex 3D scene, the user provides a list of instructions composed of target locations and locomotion types. To solve this task we present PlaMo, a scene-aware path planner and a robust physics-based controller. The path planner produces a sequence of motion paths, considering the various limitations the scene imposes on the motion, such as location, height, and speed. Complementing the planner, our control policy generates rich and realistic physical motion adhering to the plan. We demonstrate how the combination of both modules enables traversing complex landscapes in diverse forms while responding to real-time changes in the environment. Video: https://youtu.be/wWlqSQlRZ9M .

Paper Structure (21 sections, 4 equations, 10 figures, 7 tables)

This paper contains 21 sections, 4 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Method overview. A simulated scene is provided to the path planner, together with a series of textual instructions, requesting a humanoid to reach landmarks in the scene using various locomotion types. The high-level planner computes a path that is fed to a reinforcement-learning-based low-level motion controller, which controls a humanoid to follow the path using the requested locomotion type.
  • Figure 2: The three stages of our dynamic path planner: (i) $A^*$ solver, (ii) path refiner, and (iii) speed controller.
  • Figure 3: Locomotion controller module. The locomotion policy observes the character state $s_t$, the height map of the terrain $h$, and the requested path $\tau$. During training, the trajectories are randomly sampled. At inference, the trajectories account for terrain characteristics, such as obstacles. After the predicted action is simulated, the resulting state is used to compute the style reward $r^\text{amp}$ and the path-following reward $r^\tau$.
  • Figure 4: Terrain-dependent path planning. The humanoid does not take the shortest path between the two landmarks but prefers a smoother, flatter surface due to the height differences.
  • Figure 5: Randomized complex scenes. We randomize several terrain types, walls, and top obstacles to generate a variety of scenes.
  • ...and 5 more figures
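
The terrain-dependent behaviour described in Figure 4 can be illustrated with a minimal sketch of a slope-aware $A^*$ search over a height grid. This is not the paper's planner: the function name `slope_aware_astar`, the 8-connected grid, the cost form (horizontal distance plus `slope_weight` times the height change), and the `max_step` cutoff are all assumptions chosen for illustration.

```python
import heapq
import math


def slope_aware_astar(height, start, goal, slope_weight=5.0, max_step=1.0):
    """A* on a 2D height grid (hypothetical cost model, for illustration only).

    Edge cost = horizontal distance + slope_weight * |height change|.
    Moves whose height change exceeds max_step are treated as impassable.
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(height), len(height[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]

    def heuristic(cell):
        # Euclidean distance never overestimates, since every edge
        # costs at least its horizontal length.
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], cost
        if cost > best_cost[cell]:
            continue  # stale heap entry
        for dr, dc in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            dh = abs(height[nxt[0]][nxt[1]] - height[cell[0]][cell[1]])
            if dh > max_step:
                continue  # too steep to traverse
            new_cost = cost + math.hypot(dr, dc) + slope_weight * dh
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(open_heap,
                               (new_cost + heuristic(nxt), new_cost, nxt))
    return None, math.inf


# Toy terrain: a single bump sits on the straight line from start to goal.
terrain = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
path, cost = slope_aware_astar(terrain, (1, 0), (1, 4))
print(path, round(cost, 3))
```

With the slope penalty enabled, the returned path goes around the bump at cell `(1, 2)`; setting `slope_weight=0.0` recovers the straight line over it. In this sketch the penalty weight is the only knob trading path length against flatness, which is the trade-off the Figure 4 caption describes.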