Data-Driven Physics Embedded Dynamics with Predictive Control and Reinforcement Learning for Quadrupeds

Prakrut Kotecha, Aditya Shirwatkar, Shishir Kolathaya

Abstract

State-of-the-art quadrupedal locomotion approaches integrate Model Predictive Control (MPC) with Reinforcement Learning (RL), enabling complex motion capabilities with planning and terrain-adaptive behaviors. However, they often suffer from compounding errors over long horizons and offer limited interpretability because they lack physical inductive biases. We address these issues by integrating Lagrangian Neural Networks (LNNs) into an RL-MPC framework, enabling physically consistent dynamics learning. At deployment, our inverse-dynamics, infinite-horizon MPC scheme avoids costly matrix inversions, improving computational efficiency by up to 4x with minimal loss of task performance. We validate our framework through multiple ablations of the proposed LNN and its variants, showing improved sample efficiency, reduced long-horizon error, and faster real-time planning compared to unstructured neural dynamics. Lastly, we test our framework on the Unitree Go1 robot to demonstrate real-world viability.
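
To make the matrix-inversion claim concrete, the standard rigid-body form of Lagrangian dynamics is sketched below. The symbols (q, M, C, g, τ) are textbook notation assumed here for illustration and are not taken from the paper; contact forces and the floating base are omitted for brevity.

```latex
% Lagrangian and Euler-Lagrange equation (generic textbook form)
\mathcal{L}(q,\dot q) = T(q,\dot q) - V(q), \qquad
\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot q}
  - \frac{\partial \mathcal{L}}{\partial q} = \tau

% Inverse dynamics: torques from a desired motion, no inversion required
\tau = M(q)\,\ddot q + C(q,\dot q)\,\dot q + g(q)

% Forward dynamics: accelerations from torques, requires M(q)^{-1}
\ddot q = M(q)^{-1}\bigl(\tau - C(q,\dot q)\,\dot q - g(q)\bigr)
```

Under this form, a planner that proposes joint motions and evaluates the second line only needs matrix-vector products, whereas rolling out torques through the third line requires inverting (or solving with) the mass matrix at every step.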

Paper Structure

This paper contains 28 sections, 11 equations, 6 figures, 3 tables, 2 algorithms.

Figures (6)

  • Figure 1: Overview of our inverse dynamics-based planner. At each control step, the method samples candidate joint trajectories, computes the corresponding torques, evaluates them using learned reward and value functions, and updates the distribution for the next sampling round.
  • Figure 2: Our training framework. The observation history is encoded into full-state estimates and passed through a physics-informed Dreamer module, which augments the expert actor with future predictions for robust policy learning.
  • Figure 3: Training loss curves across different dynamics models. Our method achieves strong sample efficiency while balancing compute cost, compared to LNN Forward and ONN baselines.
  • Figure 4: H-step prediction error of different models. Our approach maintains stable error across horizons, with significantly lower compounding error than ONN and CoM LNN.
  • Figure 5: Trade-off between MPC inference time and control performance across planning horizons. Our inverse-dynamics-based planner achieves competitive returns with significantly lower inference latency.
  • ...and 1 more figures
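
The sampling loop described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes a cross-entropy-style distribution update, a toy decoupled two-joint model, and hand-written reward and value terms; all of these (including the names `inverse_dynamics` and `plan`, and every parameter value) are hypothetical stand-ins rather than the paper's learned components or its actual update rule.

```python
import numpy as np

def inverse_dynamics(q, qd, qdd):
    # Hypothetical toy model with decoupled joints (not the paper's LNN):
    # tau = M(q) * qdd + damping * qd + gravity(q). No matrix inversion is needed.
    mass = 1.0 + 0.1 * np.cos(q)
    return mass * qdd + 0.1 * qd + 4.9 * np.sin(q)

def plan(q0, qd0, q_goal, horizon=10, n_samples=256, n_elite=32,
         n_iters=5, dt=0.02, seed=0):
    rng = np.random.default_rng(seed)
    n_dof = q0.shape[0]
    mean = np.tile(q0, (horizon, 1))        # sampling mean, shape (H, n_dof)
    std = np.full((horizon, n_dof), 0.3)    # sampling std, shape (H, n_dof)
    for _ in range(n_iters):
        # 1. Sample candidate joint trajectories.
        q = mean + std * rng.standard_normal((n_samples, horizon, n_dof))
        # 2. Finite-difference velocities/accelerations, then compute torques.
        start_q = np.broadcast_to(q0, (n_samples, 1, n_dof))
        q_prev = np.concatenate([start_q, q[:, :-1]], axis=1)
        qd = (q - q_prev) / dt
        start_qd = np.broadcast_to(qd0, (n_samples, 1, n_dof))
        qd_prev = np.concatenate([start_qd, qd[:, :-1]], axis=1)
        qdd = (qd - qd_prev) / dt
        tau = inverse_dynamics(q_prev, qd_prev, qdd)
        # 3. Score with a stand-in reward (tracking + torque penalty)
        #    and a stand-in terminal value.
        reward = (-np.sum((q - q_goal) ** 2, axis=(1, 2))
                  - 1e-4 * np.sum(tau ** 2, axis=(1, 2)))
        value = -np.sum((q[:, -1] - q_goal) ** 2, axis=1)
        score = reward + value
        # 4. Refit the sampling distribution to the best-scoring candidates.
        elite = q[np.argsort(score)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3
    # Torque for the first step of the planned mean trajectory.
    qd1 = (mean[0] - q0) / dt
    tau0 = inverse_dynamics(q0, qd0, (qd1 - qd0) / dt)
    return mean, tau0

if __name__ == "__main__":
    plan_mean, first_tau = plan(np.zeros(2), np.zeros(2), np.array([0.3, -0.2]))
    print(plan_mean.shape, first_tau)
```

Because candidates are sampled in joint space and mapped to torques through the inverse model, each iteration costs only elementwise or matrix-vector operations; a forward-dynamics rollout over sampled torques would instead need a mass-matrix solve per step.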