Learning-Based MPC for Fuel Efficient Control of Autonomous Vehicles with Discrete Gear Selection

Samuel Mallick, Gianpietro Battocletti, Qizhang Dong, Azita Dabiri, Bart De Schutter

TL;DR

This work presents a learning-based MPC framework to achieve fuel-efficient autonomous driving by co-optimizing speed and discrete gear shifts without solving computationally intractable MINLPs online. A recurrent neural network-based policy selects a gear-shift schedule over the MPC prediction horizon, fixing the discrete decisions and leaving a continuous NLP to handle the remaining optimization, while a feasibility backup ensures constraint satisfaction. The approach retains the benefits of joint speed-gear optimization with significantly reduced online computation, demonstrating comparable performance to mixed-integer baselines and robustness to disturbances. The method shows potential for real-time deployment and generalizes to longer horizons, offering practical impact for efficient autonomous vehicle control.

Abstract

Co-optimization of vehicle speed and gear position via model predictive control (MPC) has been shown to offer benefits for fuel-efficient autonomous driving. However, optimizing both the vehicle's continuous dynamics and discrete gear positions may be too computationally intensive for a real-time implementation. This work proposes a learning-based MPC scheme to address this issue. A policy is trained to select and fix the gear positions across the prediction horizon of the MPC controller, leaving a significantly simpler continuous optimization problem to be solved online. In simulation, the proposed approach is shown to have a significantly lower computational burden than pure MPC-based co-optimization while achieving comparable performance.
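The online control flow summarized above (a policy fixes the gear schedule, a continuous problem is solved, and a backup schedule is used if that problem is infeasible) can be sketched as follows. This is a minimal illustration and not the authors' implementation: `mpc_step`, the toy policy, the toy solver, the toy backup, and the speed ranges in `TOY_RANGES` are all hypothetical stand-ins chosen only to make the example runnable.

```python
from typing import Callable, List, Optional, Tuple

Schedule = List[int]                  # one gear index per prediction step
Solution = Tuple[float, List[float]]  # hypothetical: (cost, continuous inputs)


def mpc_step(
    x: float,
    policy: Callable[[float], Schedule],
    solve_nlp: Callable[[float, Schedule], Optional[Solution]],
    backup: Callable[[float], Schedule],
) -> Tuple[Schedule, Optional[Solution], bool]:
    """One control step: fix the gears with the policy, solve the continuous
    problem, and fall back to the backup schedule if it is infeasible."""
    gears = policy(x)
    sol = solve_nlp(x, gears)
    if sol is not None:
        return gears, sol, False
    gears = backup(x)
    return gears, solve_nlp(x, gears), True


# Toy stand-ins (assumptions, not values from the paper).
HORIZON = 3
TOY_RANGES = {1: (0.0, 10.0), 2: (5.0, 20.0), 3: (15.0, 35.0)}  # speed range per gear


def toy_policy(x: float) -> Schedule:
    # Stands in for the learned policy; always proposes the top toy gear.
    return [3] * HORIZON


def toy_backup(x: float) -> Schedule:
    # Holds the lowest toy gear whose speed range contains the current speed.
    gear = min(j for j, (lo, hi) in TOY_RANGES.items() if lo <= x <= hi)
    return [gear] * HORIZON


def toy_solve_nlp(x: float, gears: Schedule) -> Optional[Solution]:
    # Stands in for the continuous solver: returns None when the first gear
    # is incompatible with the current speed, else a dummy solution.
    lo, hi = TOY_RANGES[gears[0]]
    if not lo <= x <= hi:
        return None
    return (float(sum(gears)), [0.0] * len(gears))


slow = mpc_step(8.0, toy_policy, toy_solve_nlp, toy_backup)
fast = mpc_step(20.0, toy_policy, toy_solve_nlp, toy_backup)
print(slow[0], slow[2])  # schedule used and whether the backup was needed
print(fast[0], fast[2])
```

Passing the policy, solver, and backup as callables keeps the sketch independent of any particular network or optimization library; in practice the solver would be a nonlinear programming routine and the policy a trained network.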

Paper Structure

This paper contains 11 sections, 1 theorem, 33 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Assume that, for $j \in \{1,\dots,6\}$ and for all $x_2 \in \Omega(j)$, there exist $u_1$ and $u_2$ such that $T_\text{min} \leq u_1 \leq T_\text{max}$, $F_\text{b, min} \leq u_2 \leq F_\text{b, max}$, and Then, for a state $x(k)$ such that $v_\mathrm{min} \leq x_2(k) \leq v_\mathrm{max}$, and a gear-shift sequence $\textbf{j}(k) = \sigma(x_2(k))$, problem eq:NLP has a solution, i.e., $J(x(k), \t

Figures (3)

  • Figure 1: Recurrent NN, with hidden states $h_i$, showing how the chain of inputs sequentially generates the gear-shift schedule. The maps $\psi$ and $\eta$ are input and output transformations.
  • Figure 2: Top: distribution of \ref{eq:cost_increase} across 100 episodes with median marked. Bottom: distribution of average MPC solve time for each episode across 100 episodes with median marked. The maximal time of all steps is marked with a red triangle.
  • Figure 3: Representative trajectories for each controller.
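
The recurrent structure described for Figure 1 (hidden states $h_i$, an input map $\psi$, an output map $\eta$, and a gear chosen at each step of the horizon) might look roughly like the sketch below. The layer sizes, the tanh cell, the random weights, the argmax readout, and the input features are assumptions made for illustration; the source does not specify them here. Only the six gears follow the indexing $j \in \{1,\dots,6\}$ used in Proposition 1.

```python
import numpy as np

N_GEARS = 6  # gears indexed 1..6, as in Proposition 1


def rollout_schedule(inputs, params):
    """Generate one gear per input step with a simple recurrent cell.

    inputs: array of shape (horizon, d_in), one feature vector per step.
    params: (W_in, W_h, b, W_out), hypothetical weights.
    """
    W_in, W_h, b, W_out = params
    h = np.zeros(W_h.shape[0])          # initial hidden state
    schedule = []
    for z in inputs:
        u = np.tanh(W_in @ z)           # psi: input transformation (assumed form)
        h = np.tanh(W_h @ h + u + b)    # hidden state update h_i
        logits = W_out @ h              # eta: output transformation (assumed form)
        schedule.append(int(np.argmax(logits)) + 1)  # gear index in 1..6
    return schedule


# Hypothetical sizes and random weights, for demonstration only.
rng = np.random.default_rng(0)
horizon, d_in, d_hidden = 5, 3, 8
params = (
    rng.normal(size=(d_hidden, d_in)),
    rng.normal(size=(d_hidden, d_hidden)),
    rng.normal(size=d_hidden),
    rng.normal(size=(N_GEARS, d_hidden)),
)
inputs = rng.normal(size=(horizon, d_in))
schedule = rollout_schedule(inputs, params)
print(schedule)
```

Because the hidden state is carried from step to step, the same weights can be rolled out over a horizon of any length, which is one plausible reading of why a recurrent policy suits a variable prediction horizon.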

Theorems & Definitions (3)

  • Proposition 1
  • proof
  • proof