Multi-Objective LQR with Linear Scalarization
Ali Jadbabaie, Devavrat Shah, Sean R. Sinclair
TL;DR
The paper studies multi-objective decision-making in the infinite-horizon LQR setting (MObjLQR). It proves that linear scalarization captures the Pareto front exactly: every tradeoff can be obtained by solving a single-objective LQR problem with cost matrices $Q_w=\sum_i w_i Q_i$ and $R_w=\sum_i w_i R_i$ for some weight vector $w\in\Delta([m])$, where $m$ is the number of objectives. Building on this, it introduces a grid-based algorithm that discretizes the weight simplex and calls a standard LQR solver at each grid point, approximating the Pareto front with explicit $\epsilon$-accuracy guarantees derived from perturbation theory for the discrete Riccati equation. A key result is the smoothness of the Pareto front: an $\epsilon$-perturbation of the scalarization parameter changes each objective by $O(\epsilon)$, which gives uniform approximation bounds over all objectives. The analysis extends to certainty equivalence with estimated dynamics, retaining the same order of approximation guarantee under bounded model error.
Abstract
The framework of decision-making, modeled as a Markov Decision Process (MDP), typically assumes a single objective. However, practical scenarios often involve tradeoffs between multiple objectives. We address this in the Linear Quadratic Regulator (LQR), a canonical continuous, infinite horizon MDP. First, we establish that the Pareto front for LQR is characterized by linear scalarization: a convex combination of objectives recovers all tradeoff points, making multi-objective LQR reducible to single-objective problems. This highlights an important instance where linear scalarization suffices for a non-convex problem. Second, we show the Pareto front is smooth, in that an $\epsilon$ perturbation of a scalarization parameter yields an $\epsilon$ approximation to the objective. These results inspire a simple algorithm to approximate the Pareto front via grid search over scalarization parameters, where each optimization problem retains the computational efficiency of single-objective LQR. Lastly, we extend the analysis to certainty equivalence, where unknown dynamics are replaced with estimates.
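The grid search over scalarization parameters can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things are assumptions made for illustration only: the dynamics are discrete-time and noise-free, $x_{t+1} = A x_t + B u_t$; each objective is the cost $\sum_t x_t^\top Q_i x_t + u_t^\top R_i u_t$ averaged over an initial state with identity covariance; and the matrices `A`, `B`, `Qs`, `Rs` and the helper names `simplex_grid`, `lqr_gain`, and `objective_costs` are hypothetical. Only `scipy.linalg.solve_discrete_are` and `scipy.linalg.solve_discrete_lyapunov` are real library calls.

```python
import itertools
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

def simplex_grid(m, n):
    """Weight vectors on the m-simplex whose entries are multiples of 1/n."""
    return [np.array(c) / n
            for c in itertools.product(range(n + 1), repeat=m)
            if sum(c) == n]

def lqr_gain(A, B, Q, R):
    """Optimal gain K (control law u = -K x) for a single-objective LQR."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def objective_costs(A, B, K, Qs, Rs):
    """Cost of u = -K x under each objective (assumed: x_0 has identity covariance)."""
    A_cl = A - B @ K
    costs = []
    for Q, R in zip(Qs, Rs):
        # P_K solves P_K = A_cl^T P_K A_cl + Q + K^T R K
        P_K = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)
        costs.append(np.trace(P_K))
    return np.array(costs)

# Hypothetical two-objective example on a double integrator.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Qs = [np.diag([1.0, 0.1]), np.diag([0.1, 0.1])]  # objective 1 penalizes state more
Rs = [np.array([[0.1]]), np.array([[1.0]])]      # objective 2 penalizes control more

front = []
for w in simplex_grid(m=2, n=10):
    # Scalarized cost matrices Q_w = sum_i w_i Q_i and R_w = sum_i w_i R_i.
    Q_w = sum(wi * Q for wi, Q in zip(w, Qs))
    R_w = sum(wi * R for wi, R in zip(w, Rs))
    K = lqr_gain(A, B, Q_w, R_w)
    front.append((w, objective_costs(A, B, K, Qs, Rs)))

for w, costs in front:
    print(w, costs)
```

Each grid point costs one Riccati solve, so the total work grows with the number of grid points, which in turn grows with the number of objectives and the grid resolution `n`. The positive definite `Qs` and `Rs` in this example keep every scalarized problem well posed for the solver; other choices may need additional checks.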
