Learning to Optimize for Mixed-Integer Non-linear Programming with Feasibility Guarantees
Bo Tang, Elias B. Khalil, Ján Drgoňa
TL;DR
The paper tackles the scalable solution of parametric MINLPs with a solver-free learning-to-optimize (L2O) framework that directly predicts mixed-integer decisions. The framework combines differentiable integer-correction layers with an inference-time integer feasibility projection, and the projection comes with theoretical convergence guarantees under mild regularity conditions. Empirically, the approach scales to tens of thousands of variables and delivers high-quality, feasible solutions in milliseconds, often outperforming traditional solvers and heuristics in repeated-solve settings. Ablations and analyses show that both the correction layers and the projection are necessary and that training times are favorable, which makes the method practical for real-time or large-scale deployment.
Abstract
Mixed-integer nonlinear programs (MINLPs) arise in domains such as energy systems, process engineering, and transportation, and are notoriously difficult to solve at scale due to the interplay of discrete decisions and nonlinear constraints. In many practical settings, these problems appear in parametric form, where objectives and constraints depend on instance-specific parameters, creating the need for fast and reliable solutions across related instances. While learning-to-optimize (L2O) methods have shown strong performance in continuous optimization, extending them to MINLPs requires enforcing both feasibility and integrality within a data-driven framework. We propose an L2O approach tailored to parametric MINLPs that generates instance-specific solutions using integer correction layers to enforce integrality and a gradient-based projection to ensure feasibility of the inequality constraints. Theoretically, we provide asymptotic and non-asymptotic convergence guarantees for the projection step. Empirically, the framework scales to MINLPs with tens of thousands of variables and produces feasible, high-quality solutions within milliseconds, often outperforming traditional solvers and heuristic baselines in repeated-solve settings.
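
To make the idea of a gradient-based integer feasibility projection more concrete, the following is a minimal sketch and not the authors' implementation. It assumes linear inequality constraints A x <= b for simplicity (the paper targets nonlinear constraints), replaces the learned integer correction layers with plain rounding, and uses a fixed step size. The names `violation`, `project`, `int_mask`, `step`, and `max_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def violation(x, A, b):
    """Sum of squared violations of the inequalities A @ x <= b."""
    return float(np.sum(np.maximum(A @ x - b, 0.0) ** 2))

def project(x_relaxed, int_mask, A, b, step=0.1, max_iter=200, tol=1e-9):
    """Illustrative projection: round the integer entries, check feasibility,
    and otherwise take a gradient step on the constraint violation.

    Assumption: rounding stands in for the learned correction layers.
    Returns the final point and a flag saying whether it is feasible.
    """
    x_rel = np.asarray(x_relaxed, dtype=float).copy()
    for _ in range(max_iter):
        # Enforce integrality on the entries flagged in int_mask.
        x = np.where(int_mask, np.round(x_rel), x_rel)
        if violation(x, A, b) <= tol:
            return x, True
        # Gradient of the squared violation, evaluated at the rounded point
        # and applied to the relaxed point.
        residual = np.maximum(A @ x - b, 0.0)
        x_rel = x_rel - step * (2.0 * A.T @ residual)
    x = np.where(int_mask, np.round(x_rel), x_rel)
    return x, violation(x, A, b) <= tol

# Toy instance (hypothetical): x[0] is integer, x[1] is continuous,
# and the single constraint is x[0] + x[1] <= 2.
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
int_mask = np.array([True, False])
x, ok = project(np.array([2.4, 1.0]), int_mask, A, b)
print(x, ok)
```

In this toy instance the starting point violates the constraint after rounding, and repeated gradient steps move the relaxed point until the rounded point satisfies it. The convergence guarantees mentioned in the abstract concern the authors' projection, not this simplified loop.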
