Bilevel Optimization for Real-Time Control with Application to Locomotion Gait Generation

Zachary Olkin; Aaron D. Ames

TL;DR

This paper introduces a real-time bilevel optimization scheme that couples a low-level MPC with a high-level parameter optimizer to adapt control parameters on the fly. It derives gradient-based updates by differentiating through the QP-based MPC subproblem and employs Wolfe-condition line searches to ensure descent, with convergence guarantees under nominal assumptions. The method is applied to quadruped gait generation by parameterizing contact schedules via foot contact times and a spline-based representation, achieving real-time performance (≈90 Hz with 20 nodes) and improved disturbance rejection alongside new, diverse gaits. The results demonstrate both theoretical guarantees and practical viability, offering a scalable approach for online gait optimization compatible with existing MPC frameworks and competitive with CIMPC techniques.

Abstract

Model Predictive Control (MPC) is a common tool for the control of nonlinear, real-world systems, such as legged robots. However, solving MPC quickly enough to enable its use in real time is often challenging. One common solution is real-time iteration, which does not solve the MPC problem to convergence, but only closely enough to give an approximate solution. In this paper, we extend this idea to a bilevel control framework where a "high-level" optimization program modifies a controller parameter of a "low-level" MPC problem, which in turn generates the control inputs and desired state trajectory. We propose an algorithm to iterate on this bilevel program in real time and provide conditions for its convergence and improvements in stability. We then demonstrate the efficacy of this algorithm by applying it to a quadrupedal robot where the high-level problem optimizes a contact schedule in real time. We show through simulation that the algorithm can yield improvements in disturbance rejection and optimality, while creating qualitatively new gaits.
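The bilevel iteration described above can be illustrated with a minimal sketch. It is not the authors' implementation: the low-level problem is a hypothetical scalar toy cost solved in closed form rather than an MPC QP, the function names (`solve_low_level`, `armijo_step`, `run_bilevel`) are invented for illustration, and the line search checks only the sufficient-decrease (Armijo) half of the Wolfe conditions.

```python
def solve_low_level(theta):
    """Toy stand-in for the low-level MPC QP (hypothetical, not the paper's model).

    Minimizes f(z, theta) = (z - 1)^2 + (z - theta)^2 + 0.1 * theta^2 over z.
    Returns the minimizer, the optimal cost J(theta), and dJ/dtheta.
    """
    z_star = 0.5 * (1.0 + theta)  # stationarity: 2(z - 1) + 2(z - theta) = 0
    cost = (z_star - 1.0) ** 2 + (z_star - theta) ** 2 + 0.1 * theta ** 2
    # Envelope argument: dJ/dtheta equals the partial of f w.r.t. theta at z_star.
    grad = -2.0 * (z_star - theta) + 0.2 * theta
    return z_star, cost, grad


def armijo_step(theta, cost, grad, alpha0=4.0, c1=1e-4, max_halvings=30):
    """Halve the step until sufficient decrease holds; return 0.0 if it never does."""
    alpha = alpha0
    for _ in range(max_halvings):
        _, trial_cost, _ = solve_low_level(theta - alpha * grad)
        if trial_cost <= cost - c1 * alpha * grad * grad:
            return alpha
        alpha *= 0.5
    return 0.0


def run_bilevel(theta0, ticks=25, tol=1e-9):
    """Take one high-level gradient step per (simulated) control tick."""
    theta, history = theta0, []
    for _ in range(ticks):
        _, cost, grad = solve_low_level(theta)  # low-level solve at current parameter
        history.append(cost)
        if abs(grad) < tol:
            break
        theta -= armijo_step(theta, cost, grad) * grad  # high-level update
    return theta, history


theta_final, costs = run_bilevel(theta0=3.0)
print(theta_final, costs[-1])
```

In this toy problem the optimal cost is J(θ) = 0.5(θ − 1)² + 0.1θ², so the iterates should approach θ = 5/6 while the recorded cost never increases. A real-time variant would replace the closed-form solve with a single QP iteration of the MPC per tick.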

Paper Structure (12 sections, 7 theorems, 30 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Algorithm and Theoretical Properties
Problem Setup
Main Result
Extension for Constraints
Application to Gait Generation
High-Level Optimization
Model
Spline Parameterization
Constraints
Simulation Results
Conclusion

Key Result

Proposition 1

Consider the QP subproblem associated with $\omega(\theta) \in \Omega$, where $\theta$ is a parameter, and let $J$ denote its optimal cost. If at the optimal solution $(z^*, \lambda^*, \nu^*)$ (with Lagrange multipliers $\lambda^*$ and $\nu^*$) $\omega$ is smooth with respect to $\theta$; $Q \succ 0$; the Linear Independence Constraint Qualification (LICQ) (see definition 12.) where in particular …
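The idea of differentiating the optimal cost of a QP with respect to a parameter can be checked numerically on a small example. The sketch below is an assumption-laden illustration, not the proposition's exact statement: the QP data (`Q`, `A_ROW`, the maps `q_of` and `b_of`) are invented, only one equality constraint is used (no inequality multipliers $\lambda$), and the gradient comes from the standard envelope argument $dJ/d\theta = \partial L / \partial \theta$ evaluated at $(z^*, \nu^*)$, with $L = \tfrac{1}{2} z^\top Q z + q(\theta)^\top z + \nu\,(a^\top z - b(\theta))$.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    A = [row[:] + [r] for row, r in zip(M, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (A[i][n] - s) / A[i][i]
    return x


# Hypothetical QP data: min_z 0.5 z^T Q z + q(theta)^T z  s.t.  a^T z = b(theta)
Q = [[2.0, 0.5], [0.5, 1.0]]  # positive definite
A_ROW = [1.0, 1.0]


def q_of(theta):
    return [theta, -1.0]      # dq/dtheta = [1, 0]


def b_of(theta):
    return 1.0 + 0.5 * theta  # db/dtheta = 0.5


def solve_qp(theta):
    """Return (z_star, nu_star, optimal cost) from the KKT linear system."""
    q, b = q_of(theta), b_of(theta)
    kkt = [Q[0] + [A_ROW[0]], Q[1] + [A_ROW[1]], A_ROW + [0.0]]
    z1, z2, nu = solve_linear(kkt, [-q[0], -q[1], b])
    quad = 0.5 * (Q[0][0] * z1 * z1 + 2.0 * Q[0][1] * z1 * z2 + Q[1][1] * z2 * z2)
    return [z1, z2], nu, quad + q[0] * z1 + q[1] * z2


def cost_gradient(theta):
    """Envelope gradient: (dq/dtheta)^T z_star - nu_star * db/dtheta."""
    z, nu, _ = solve_qp(theta)
    return 1.0 * z[0] + 0.0 * z[1] - nu * 0.5


theta = 0.7
h = 1e-5
finite_diff = (solve_qp(theta + h)[2] - solve_qp(theta - h)[2]) / (2.0 * h)
analytic = cost_gradient(theta)
print(analytic, finite_diff)
```

The analytic value needs only the primal-dual solution that the QP solve already produces, which is what makes a gradient step on the parameter cheap relative to re-solving the problem at perturbed parameters.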

Figures (3)

Figure 1: Structure of the bilevel optimization. The MPC uses parameters from the high-level optimization and outputs the control inputs and state trajectory. When applied to the quadruped, the high-level parameter is the contact schedule. The green dots indicate contact with the ground and show how the contact schedule changes over time.
Figure 2: Demonstration of diagonal walking. We observe that throughout the motion the robot achieves a number of different contact sequences. The green circles denote feet in contact with the ground. We start in a diagonal trot gait (diagonal legs move together) and by the end, the trajectory "disappears" as the high-level optimizer adjusts the parameter such that its legs no longer move. The graph on the left shows the planned contact state times in blue and actual contact states in green. The middle graph plots MPC ground reaction forces which depend directly on the times computed in the high-level optimization on the left. The right-most graph shows the cost function over time.
Figure 3: Snapshots of the robot while recovering from a disturbance in the $y$ direction, which is signified by the red arrow in the first tile. Observe in the upper right image that post-disturbance, three feet are off the ground while the original gait specified only two feet in the air at once. At the end, the robot's legs stop moving as one might naturally expect.

Theorems & Definitions (16)

Definition 1
Proposition 1
proof
Lemma 1
proof
Lemma 2
proof
Theorem 1
proof
Lemma 3
...and 6 more
