Learning Multiple Initial Solutions to Optimization Problems
Elad Sharony, Heng Yang, Tong Che, Marco Pavone, Shie Mannor, Peter Karkus
TL;DR
This work addresses solving sequences of similar optimization problems under strict runtime limits by learning to predict multiple diverse initial solutions. It introduces Learning Multiple Initial Solutions (MISO), a single neural network that outputs $K$ candidate initial solutions and supports two usage modes: (i) a single optimizer initialized with the most promising candidate, chosen by a selection function, and (ii) multiple optimizers, potentially run in parallel, each initialized with a different candidate, with the best result chosen afterward. To encourage multimodal predictions, it proposes training objectives including a pairwise distance loss, a winner-takes-all loss, and a mixture loss. Including a default initialization among the candidates guarantees that the final cost is equal to or lower than the cost obtained with the default initialization alone. The approach is validated on three optimal control benchmarks (cart-pole, reacher, autonomous driving) using three optimizers (DDP, MPPI, iLQR), showing significant and consistent improvements across all evaluation settings and efficient scaling with the number of initial solutions; code is available at MISO (https://github.com/EladSharony/miso). The results indicate that learning diverse initial solutions, optionally combined with parallel optimizers, improves the reliability and efficiency of local optimization in time-constrained settings.
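To make the two diversity-oriented objectives named above more concrete, here is a minimal NumPy sketch. The exact formulations are not given in this summary, so both functions are assumptions: `winner_takes_all_loss` penalizes only the prediction closest to a reference solution, and `pairwise_distance_penalty` is the negative mean pairwise distance, so that minimizing it spreads the predictions apart. The function names, array shapes, and example values are all hypothetical.

```python
import numpy as np

def winner_takes_all_loss(preds, target):
    """Squared error of the closest of K predictions (assumed form).

    preds:  array of shape (K, D), the K predicted initial solutions
    target: array of shape (D,), a reference (e.g. optimized) solution
    """
    errors = np.sum((preds - target) ** 2, axis=1)  # one error per prediction
    return float(np.min(errors))                    # only the "winner" counts

def pairwise_distance_penalty(preds):
    """Negative mean pairwise Euclidean distance (assumed form).

    Minimizing this value pushes the K predictions away from each other.
    """
    k = preds.shape[0]
    diffs = preds[:, None, :] - preds[None, :, :]   # shape (K, K, D)
    dists = np.sqrt(np.sum(diffs ** 2, axis=-1))    # shape (K, K)
    upper = np.triu_indices(k, k=1)                 # each unordered pair once
    return float(-np.mean(dists[upper]))

# Hypothetical example: K = 2 predictions in D = 2 dimensions.
preds = np.array([[0.0, 0.0], [3.0, 4.0]])
target = np.array([3.0, 4.5])

wta = winner_takes_all_loss(preds, target)
spread = pairwise_distance_penalty(preds)
```

In this toy case the second prediction is the closer one, so only its error enters the winner-takes-all value, while the penalty depends only on how far apart the two predictions are.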
Abstract
Sequentially solving similar optimization problems under strict runtime constraints is essential for many applications, such as robot control, autonomous driving, and portfolio management. The performance of local optimization methods in these settings is sensitive to the initial solution: poor initialization can lead to slow convergence or suboptimal solutions. To address this challenge, we propose learning to predict multiple diverse initial solutions given parameters that define the problem instance. We introduce two strategies for utilizing multiple initial solutions: (i) a single-optimizer approach, where the most promising initial solution is chosen using a selection function, and (ii) a multiple-optimizers approach, where several optimizers, potentially run in parallel, are each initialized with a different solution, with the best solution chosen afterward. Notably, by including a default initialization among the predicted ones, the cost of the final output is guaranteed to be equal to or lower than the cost obtained with the default initialization. We validate our method on three optimal control benchmark tasks: cart-pole, reacher, and autonomous driving, using different optimizers: DDP, MPPI, and iLQR. We find significant and consistent improvement with our method across all evaluation settings and demonstrate that it efficiently scales with the number of initial solutions required. The code is available at MISO (https://github.com/EladSharony/miso).
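The two strategies can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the one-dimensional non-convex `cost`, plain gradient descent standing in for a local optimizer such as DDP or iLQR, the hard-coded `predicted` list standing in for the network's outputs, and "lowest initial cost" as the selection function. It is not the authors' implementation.

```python
def cost(x):
    # Hypothetical non-convex objective with two local minima.
    return 0.1 * x**4 - x**2 + 0.3 * x

def grad(x):
    # Analytic derivative of the toy cost.
    return 0.4 * x**3 - 2.0 * x + 0.3

def local_optimize(x0, steps=50, lr=0.05):
    # Gradient descent as a stand-in for a local optimizer.
    x = float(x0)
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def single_optimizer(candidates):
    # Strategy (i): a selection function picks one candidate to optimize.
    # Assumed selection rule: the candidate with the lowest initial cost.
    best_init = min(candidates, key=cost)
    return local_optimize(best_init)

def multiple_optimizers(candidates):
    # Strategy (ii): optimize from every candidate, keep the best result.
    # The runs are independent, so they could be executed in parallel.
    solutions = [local_optimize(x0) for x0 in candidates]
    return min(solutions, key=cost)

default_init = 2.0                        # hypothetical default initialization
predicted = [-2.5, 0.5, 1.5]              # stand-in for K = 3 network outputs
candidates = predicted + [default_init]   # default included among candidates

x_default = local_optimize(default_init)
x_single = single_optimizer(candidates)
x_multi = multiple_optimizers(candidates)

print(cost(x_default), cost(x_single), cost(x_multi))
```

In the multiple-optimizers mode the final cost cannot exceed the cost reached from the default initialization, because the default run is one of the results being compared. In this toy problem the default start converges to the shallower local minimum, while a candidate on the other side of the barrier reaches the deeper one.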
