RK-MPC: Residual Koopman Model Predictive Control for Quadruped Locomotion in Offroad Environments

Sriram S. K. S. Narayanan, Umesh Vaidya

Abstract

This paper presents Residual Koopman MPC (RK-MPC), a Koopman-based, data-driven model predictive control framework for quadruped locomotion that improves prediction fidelity while preserving real-time tractability. RK-MPC augments a nominal template model with a compact linear residual predictor learned from data in lifted coordinates, enabling systematic correction of model mismatch induced by contact variability and terrain disturbances with provable bounds on multi-step prediction error. The learned residual model is embedded within a convex quadratic-program MPC formulation, yielding a receding-horizon controller that runs onboard at 500 Hz and retains the structure and constraint-handling advantages of optimization-based control. We evaluate RK-MPC in both Gazebo simulation and Unitree Go1 hardware experiments, demonstrating reliable blind locomotion across contact disturbances, multiple gait schedules, and challenging off-road terrains including grass, gravel, snow, and ice. We further compare against Koopman/EDMD baselines using alternative observable dictionaries, including monomial and $SE(3)$-structured bases, and show that the residual correction improves multi-step prediction and closed-loop performance while reducing sensitivity to the choice of observables. Overall, RK-MPC provides a practical, hardware-validated pathway for data-driven predictive control of quadrupeds in unstructured environments. See https://sriram-2502.github.io/rk-mpc for implementation videos.
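The residual idea described in the abstract (a nominal model plus a linear correction learned in lifted coordinates) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the toy two-state system, the degree-2 monomial dictionary, and the names `lift`, `fit_residual`, `predict`, `nominal_step`, and `true_step` are all assumptions made for illustration, and the sketch only regresses the one-step residual on lifted features rather than reproducing the paper's predictor or its MPC formulation.

```python
import numpy as np

def lift(X):
    """Hypothetical dictionary: the state plus its degree-2 monomials."""
    X = np.atleast_2d(X)
    i, j = np.triu_indices(X.shape[1])
    return np.hstack([X, X[:, i] * X[:, j]])

def fit_residual(X, U, X_next, nominal_step):
    """Least-squares fit of the one-step residual on lifted state and input."""
    R = X_next - nominal_step(X, U)          # what the nominal model misses
    Phi = np.hstack([lift(X), U])            # regression features
    W, *_ = np.linalg.lstsq(Phi, R, rcond=None)
    return W

def predict(X, U, nominal_step, W):
    """Combined predictor: nominal step plus learned linear residual."""
    return nominal_step(X, U) + np.hstack([lift(X), U]) @ W

# Toy system (assumed for illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])

def nominal_step(X, U):
    return X @ A.T + U @ B.T

def true_step(X, U):
    X_next = nominal_step(X, U)
    X_next[:, 1] -= 0.05 * X[:, 0] * X[:, 1]  # unmodeled bilinear term
    return X_next

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
U = rng.uniform(-1, 1, size=(200, 1))
W = fit_residual(X, U, true_step(X, U), nominal_step)

X_test = rng.uniform(-1, 1, size=(50, 2))
U_test = rng.uniform(-1, 1, size=(50, 1))
truth = true_step(X_test, U_test)
err_nominal = np.abs(nominal_step(X_test, U_test) - truth).max()
err_combined = np.abs(predict(X_test, U_test, nominal_step, W) - truth).max()
print(f"nominal: {err_nominal:.2e}, nominal + residual: {err_combined:.2e}")
```

In this toy case the unmodeled term lies inside the span of the dictionary, so the corrected predictor matches the true step almost exactly. On real locomotion data the residual would only be approximated, which is where a multi-step error bound of the kind the paper states becomes relevant.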

Paper Structure

This paper contains 33 sections, 1 theorem, 61 equations, 10 figures, 5 tables.

Key Result

Theorem 2

Let the true discrete-time system evolve as [...]. Let the nominal predictor be given by eq:nominal_ltv, with the residual defined in eq:residual_error and the residual Koopman predictor defined in eq:residual_lifted. The combined state predictor is [...], with the state prediction error [...]. Then, using Assumption assumption1, there exists a constant $\gamma>0$, depending only on $A^{\mathrm{res}}, B^{\mathrm{res}}$, ...

Figures (10)

  • Figure B1: Single rigid-body (SRB) model. The floating base evolves under stance ground reaction forces $f_i$ applied at the contact points with moment arms $r_i$, generating the net centroidal wrench that propels the robot forward.
  • Figure B2: RK-MPC locomotion stack. A hierarchical pipeline converts high-level velocity commands into joint-level torque/position commands. A state estimator provides the feedback state to the finite-state machine and the convex residual Koopman MPC, which generates optimal ground reaction forces. The resulting commands are tracked by the joint controller and executed on hardware.
  • Figure C1: Residual Koopman modeling framework
  • Figure E1: Dataset generation for Koopman residual learning. Left: body-velocity samples $(v_x,v_y)$ with marginal histograms, colored by yaw rate $\omega_z$, illustrating excitation coverage. Middle: Gazebo simulation setup with randomized off-road terrain with per-episode friction $\mu$. Right: representative episode time histories of base linear velocity $v$, angular velocity $\omega$, vertical contact forces $f_z$, and trot gait phase.
  • Figure E2: Prediction performance. Comparison of the proposed residual Koopman model (orange) against a nonlinear SRB model (blue dashed), EDMD with monomials (green dashed), and EDMD with an $SE(3)$ basis (purple dashed) on a test trajectory (black). (a) Prediction of the $x$-$y$ trajectory over 100 time steps. (b--c) Per-channel RMSE box plots for linear velocities $(v_x,v_y,v_z)$ and angular velocities $(\omega_x,\omega_y,\omega_z)$. (d) Mean $SO(3)$ geodesic attitude error of EDMD-$SE(3)$ over the multi-step rollout horizon. (e) Overall residual Koopman RMSE versus the number of training samples. (f) Overall RMSE versus monomial degree for the residual Koopman model.
  • ...and 5 more figures
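
Panel (d) of Figure E2 reports an $SO(3)$ geodesic attitude error. As a minimal sketch, assuming the standard definition (the rotation angle of the relative rotation between two attitudes), the metric can be computed as follows. The helper names `so3_geodesic` and `rot_z` are hypothetical and not taken from the paper.

```python
import numpy as np

def so3_geodesic(R_a, R_b):
    """Angle (rad) of the relative rotation R_a^T R_b between two rotation matrices."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    # Clip to guard against round-off pushing the value outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def rot_z(angle):
    """Rotation about the z-axis, used here only to build test attitudes."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Two yaw-only attitudes 0.2 rad apart.
yaw_error = so3_geodesic(rot_z(0.3), rot_z(0.5))
```

Averaging this quantity over the steps of a rollout gives a mean attitude error that, unlike per-channel Euler-angle RMSE, does not depend on the choice of angle parameterization.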

Theorems & Definitions (5)

  • Remark 1
  • Theorem 2
  • proof
  • Remark 3
  • Remark 4