A two-stage search framework for constrained multi-gradient descent
Yuan-Zheng Lei, Yaobang Gong, Xianfeng Terry Yang
TL;DR
This work addresses constrained multi-objective optimization by extending the multi-gradient descent algorithm (MGDA) with a two-stage search that ensures valid descent directions under convex constraints. Stage 1 solves a min-max problem to obtain a weakly Pareto stationary solution; Stage 2 refines it toward full Pareto stationarity by minimizing the lower bound of the directional derivatives. Both stages reduce to linear programs when the constraints are linear. The approach is validated on a toy problem, a multi-regime fundamental diagram calibration, and a large-scale portfolio optimization, where it consistently yields balanced Pareto fronts and outperforms NSGA-II and NSGA-III in solution quality and reliability, especially in constrained or high-dimensional settings. The results demonstrate the method’s practical value for large-scale, constrained multi-objective problems and invite extensions to non-convex constraints via linearization.
Abstract
The multi-gradient descent algorithm (MGDA) finds a common descent direction that can improve all objectives by identifying the minimum-norm point in the convex hull of the objective gradients. This method has become a foundational tool in large-scale multi-objective optimization, particularly in multi-task learning. However, MGDA may struggle with constrained problems, whether constraints are incorporated into the gradient hull or handled via projection onto the feasible region. To address this limitation, we propose a two-stage search algorithm for constrained multi-objective optimization. The first stage formulates a min-max problem that minimizes the upper bound of directional derivatives under constraints, yielding a weakly Pareto stationary solution with balanced progress across objectives. The second stage refines this solution by minimizing the lower bound of directional derivatives to achieve full Pareto stationarity. We evaluate the proposed method on three numerical examples. In a simple case with a known analytical Pareto front, our algorithm converges rapidly. In more complex real-world problems, it consistently outperforms the evolutionary baselines NSGA-II and NSGA-III.
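To make the min-norm idea behind unconstrained MGDA concrete, the following is a minimal sketch for the special case of two objectives, where the minimum-norm point in the convex hull of the two gradients has a closed form. It illustrates only the classical unconstrained step, not the two-stage method proposed here; the function names `min_norm_two` and `common_descent_direction` are hypothetical and chosen for illustration.

```python
def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def min_norm_two(g1, g2):
    """Return (gamma, v), where v = gamma*g1 + (1-gamma)*g2 is the
    minimum-norm point on the segment between gradients g1 and g2.

    Hypothetical helper: covers only the two-objective case.
    """
    diff = [b - a for a, b in zip(g1, g2)]  # g2 - g1
    denom = dot(diff, diff)                 # ||g1 - g2||^2
    if denom == 0.0:
        gamma = 0.5                         # gradients coincide; any weight works
    else:
        # Unconstrained minimizer of ||gamma*g1 + (1-gamma)*g2||^2,
        # then clipped to [0, 1] to stay inside the convex hull.
        gamma = dot(diff, g2) / denom
        gamma = max(0.0, min(1.0, gamma))
    v = [gamma * a + (1.0 - gamma) * b for a, b in zip(g1, g2)]
    return gamma, v


def common_descent_direction(g1, g2):
    """Negative of the min-norm point. If it is nonzero, both directional
    derivatives g1.d and g2.d are negative; if it is zero, the current
    point is Pareto stationary (unconstrained case)."""
    _, v = min_norm_two(g1, g2)
    return [-x for x in v]
```

For example, with gradients `[1.0, 0.0]` and `[0.0, 1.0]` the sketch weights both equally and returns the direction `[-0.5, -0.5]`, which decreases both objectives at the same rate. Nothing in this step accounts for a feasible region, which is the gap the two-stage search is meant to close.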
