PhyPlan: Generalizable and Rapid Physical Task Planning with Physics Informed Skill Networks for Robot Manipulators

Mudit Chopra; Abhinav Barnawal; Harshil Vagadia; Tamajit Banerjee; Shreshth Tuli; Souvik Chakraborty; Rohan Paul

TL;DR

PhyPlan presents a data-efficient framework that fuses physics-informed neural networks with a modified MCTS to enable rapid, generalizable planning for long-horizon physical tasks. By learning skill trajectories under governing dynamics and using GP corrections to align simulated rewards with real-world outcomes, it balances fast low-fidelity rollouts with selective high-fidelity checks. The method shows improved goal reachability, reduced regret, and stronger data efficiency on unseen 3D manipulation tasks with a Franka Emika arm, outperforming physics-uninformed baselines and model-free planners. This approach advances practical robot reasoning in contact-rich environments by integrating physics priors, structured planning, and online adaptation.

Abstract

Given the task of positioning a ball-like object in a goal region beyond direct reach, humans can often throw, slide, or rebound objects against a wall to attain the goal. However, enabling robots to reason similarly is non-trivial. Existing methods for physical reasoning are data-hungry and struggle with the complexity and uncertainty inherent in the real world. This paper presents PhyPlan, a novel physics-informed planning framework that combines physics-informed neural networks (PINNs) with a modified Monte Carlo Tree Search (MCTS) to enable embodied agents to perform dynamic physical tasks. PhyPlan leverages PINNs to simulate and predict the outcomes of actions quickly and accurately, and uses MCTS for planning. It dynamically determines whether to consult a PINN-based simulator (coarse but fast) or engage directly with the actual environment (fine but slow) to determine the optimal policy. Given an unseen task, PhyPlan can infer the sequence of actions and learn the latent parameters, resulting in a generalizable approach that can rapidly learn to perform novel physical tasks. Evaluation with robots in simulated 3D environments demonstrates the ability of our approach to solve 3D physical reasoning tasks involving the composition of dynamic skills. Quantitatively, PhyPlan excels in several aspects: (i) it achieves lower regret when learning novel tasks compared to the state-of-the-art, (ii) it expedites skill learning and enhances the speed of physical reasoning, and (iii) it demonstrates higher data efficiency compared to a physics-uninformed approach.
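To make the idea of a physics-informed loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single hypothetical skill (a ball in free flight) governed by d²y/dt² + g = 0, and it estimates the second derivative with finite differences rather than the automatic differentiation a real PINN would use. The names `physics_residual`, `pinn_loss`, and `weight` are illustrative assumptions.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed governing constant)

def physics_residual(t, y_pred):
    """Residual of the assumed governing equation d2y/dt2 + g = 0,
    estimated with central finite differences on a uniform time grid."""
    dt = t[1] - t[0]
    accel = (y_pred[2:] - 2.0 * y_pred[1:-1] + y_pred[:-2]) / dt**2
    return accel + G

def pinn_loss(t, y_pred, y_obs, obs_idx, weight=1.0):
    """Data misfit on the observed samples plus a weighted physics penalty."""
    data_term = np.mean((y_pred[obs_idx] - y_obs) ** 2)
    physics_term = np.mean(physics_residual(t, y_pred) ** 2)
    return data_term + weight * physics_term

t = np.linspace(0.0, 1.0, 101)
true_y = 2.0 + 3.0 * t - 0.5 * G * t**2   # exact ballistic height
obs_idx = np.array([0, 50, 100])          # only three observed samples
y_obs = true_y[obs_idx]

# A candidate trajectory that matches all three observations but violates
# the physics: the added sine term vanishes at t = 0, 0.5 and 1.
wiggly_y = true_y + 0.2 * np.sin(4.0 * np.pi * t)

loss_true = pinn_loss(t, true_y, y_obs, obs_idx)
loss_wiggly = pinn_loss(t, wiggly_y, y_obs, obs_idx)
print(loss_true, loss_wiggly)
```

Both candidates fit the sparse observations equally well, so a data-only loss could not tell them apart. The physics term is what penalizes the second one, which is one plausible reading of why a physics-informed network might need less training data than an uninformed one.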

Paper Structure (15 sections, 3 equations, 11 figures, 2 tables, 1 algorithm)

This paper contains 15 sections, 3 equations, 11 figures, 2 tables, 1 algorithm.

Introduction
Related Works
Problem Setup
Technical Approach
Physics-Informed Skill Networks
Generalized Physical Task Planning
Evaluation Setup
Simulation Environment and Training Details
Baseline Approaches for Comparison
Results
Predictive Accuracy of Physical Skill Networks
Goal Reachability in Physical Reasoning Tasks
Generalization and Adaptation
Qualitative Analysis
Conclusions and Future Work

Figures (11)

Figure 1: Approach Overview. Clockwise from top. (1) The robot learns a model of physical skills such as hitting, swinging, and sliding using a Physics-Informed Neural Network (PINN). The network incorporates a coarse physics governing equation in its loss function and can predict the future trajectory of the interacting object as well as the latent physical parameters of the domain. Given a new task (2), a Monte-Carlo Tree Search (MCTS) (3) searches for a plan by exploring the space of skill compositions, i.e., sampling over continuous parameters such as the height of object release or the angle of a wedge with which the object may collide. Simulating the trajectories during physical interaction using a forward pass of a PINN is faster than simulating rich inter-object dynamic interactions in a high-fidelity physics simulator (with complex physical models). The MCTS plan search (4) balances sampling with the PINNs against occasional rollouts in the high-fidelity simulator. The discrepancy between the reward predicted by PINN rollouts and the reward obtained from rollouts in the physics simulator is modelled using a Gaussian Process ($\mathcal{GP}$) that guides the MCTS towards the correct plan.
Figure 2: Adaptive Physical Task Planning in the Bridge task. Let sub-actions $a_0, a_1, a_2$ represent the continuous spaces of the bridge's orientation, the pendulum's plane, and the release angle, respectively. The PINN-MCTS tree selects sub-action $a^j_i$ at depth $i$, predicts the value of the selected action sequence using PINN_Rollout, updates the individual sub-action values, and repeats for $\mathcal{K}$ iterations. Then, the best action sequence $A_t$ is executed in the environment, and the reward model is updated using a Gaussian Process ($\mathcal{GP}$), for $t \leq T$ trials.
Figure 3: Benchmark Physical Reasoning Tasks. This paper considers four tasks (shown above) inspired by allen2020rapid and bakhtin2019phyre. Each task requires the robot to place a dynamic object (a movable ball) in a goal region (a container). The ball is not in the direct kinematic range of the robot. Hence, the robot must make use of sequential dynamic interactions such as hitting, sliding, and rebounding so that the ball lands close to the goal. Further, physical parameters such as the coefficient of friction are latent and must be inferred, allowing the robot to generalize to new settings.
Figure 4: Predictive Accuracy of Skill Networks. Comparison of prediction loss on the validation dataset. PINN (Physics and Data) performs better than the other networks because it combines the guiding physics with exact data. Perception models are comparatively less accurate due to inherent noise in the perceived data.
Figure 5: Training Data Efficiency. PINN requires less training data, collected at various time steps in each rollout, to achieve the same validation loss as NN.
...and 6 more figures
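The captions for Figures 1 and 2 describe a loop in which a coarse reward prediction is corrected by a Gaussian Process fitted to the gap between predicted and observed rewards. The sketch below illustrates only that correction step, under loud assumptions: the action is one-dimensional, both reward functions are hypothetical stand-ins (not the paper's PINN or simulator), the first three trial actions are fixed probes, and a grid search replaces the MCTS. The GP is a hand-written zero-mean model with an RBF kernel; its length scale and noise level are arbitrary choices.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two 1-D arrays of actions."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_mean(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean of a zero-mean GP fitted to (x_train, y_train)."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    return rbf(x_query, x_train) @ np.linalg.solve(K, y_train)

def coarse_reward(a):
    """Hypothetical stand-in for a reward predicted from a PINN rollout."""
    return -(a - 0.4) ** 2

def env_reward(a):
    """Hypothetical stand-in for the reward seen in the real environment."""
    return -(a - 0.6) ** 2

grid = np.linspace(0.0, 1.0, 201)  # candidate actions (grid search, not MCTS)
a_uncorrected = grid[np.argmax(coarse_reward(grid))]

# Fixed probe actions and the reward gaps observed for them.
tried = [0.1, 0.5, 0.9]
gaps = [env_reward(a) - coarse_reward(a) for a in tried]

for _ in range(5):  # a few further trials
    correction = gp_mean(np.array(tried), np.array(gaps), grid)
    a_best = grid[np.argmax(coarse_reward(grid) + correction)]
    # "Execute" the chosen action and record the new reward gap.
    tried.append(a_best)
    gaps.append(env_reward(a_best) - coarse_reward(a_best))

correction = gp_mean(np.array(tried), np.array(gaps), grid)
a_corrected = grid[np.argmax(coarse_reward(grid) + correction)]
print(a_uncorrected, a_corrected)
```

In this toy setting the uncorrected planner settles on the optimum of the coarse model, while the corrected planner moves toward the optimum of the environment reward. The fixed probes matter: choosing actions purely from the corrected mean with no initial spread can keep revisiting the coarse optimum, which is why a practical planner needs some form of exploration.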
