Partially-Supervised Neural Network Model For Quadratic Multiparametric Programming

Fuat Can Beylunioglu, Mehrdad Pirnia, P. Robert Duimering

TL;DR

This work addresses predicting solutions to multiparametric quadratic programs with linear constraints by introducing a partially-supervised neural network (PSNN) that mirrors the problem's piecewise-linear solution structure. By analytically deriving a large portion of the network weights from the Lagrangian and critical regions, PSNN achieves exact or near-exact KKT-feasible predictions with far less training data than conventional deep networks, demonstrated on DC-OPF in energy systems. The method combines a fixed-first-layer encoding of active-set slopes with a learned mapping informed by the inverse Jacobian of the KKT system, and uses a solver-free data-generation strategy guided by critical-region discovery to populate training data efficiently. Empirically, PSNN attains superior accuracy and robustness to out-of-distribution tests compared with standard NNs, while offering drastic speed advantages over commercial solvers, enabling rapid generation of large-scale feasible-optimal solution distributions for planning under uncertainty.

Abstract

Neural Networks (NN) with ReLU activation functions are used to model multiparametric quadratic optimization problems (mp-QP) in diverse engineering applications. Researchers have suggested leveraging the piecewise affine property of deep NN models to solve mp-QP with linear constraints, which also exhibit piecewise affine behaviour. However, traditional deep NN applications to mp-QP fall short of providing optimal and feasible predictions, even when trained on large datasets. This study proposes a partially-supervised NN (PSNN) architecture that directly represents the mathematical structure of the global solution function. In contrast to generic NN training approaches, the proposed PSNN method derives a large proportion of model weights directly from the mathematical properties of the optimization problem, producing more accurate solutions despite significantly smaller training datasets. Many energy management problems are formulated as QP, so we apply the proposed approach to energy systems (specifically DC optimal power flow) to demonstrate proof of concept. Model performance in terms of solution accuracy and speed of predictions was compared against a commercial solver and a generic deep NN model based on classical training. Results show that, measured against the KKT sufficient conditions, PSNN consistently outperforms generic NN architectures with classical training while using far less data, including when tested on extreme, out-of-training-distribution data. Given its speed advantages over traditional solvers, the PSNN model can produce optimal and feasible solutions within a second for millions of input parameters sampled from a distribution of stochastic demands and renewable generator dispatches, which can be used for simulations and long-term planning.

Paper Structure

This paper contains 22 sections, 2 theorems, 28 equations, 14 figures, 4 tables, 3 algorithms.

Key Result

Theorem 1

For the mp-QP problem eq:ineq, $\Theta_f \subseteq \Theta$ is a convex set, and the primal solution function $\mathbf{x}(\boldsymbol{\theta}): \Theta_f \rightarrow \mathbb{R}^n$ is continuous and piecewise affine. Also the optimal objective function $\mathbf{z}(\boldsymbol{\theta}):\Theta_f\rightarrow \mathbb{R}$ ...
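The piecewise affine claim can be illustrated on a toy problem that is not taken from the paper: minimise $\tfrac{1}{2}x^2 - \theta x$ subject to $0 \le x \le 1$. Its solution is $\theta$ clipped to $[0, 1]$, with three critical regions ($\theta \le 0$, $0 \le \theta \le 1$, $\theta \ge 1$), and it can be written exactly as a sum of two ReLU units. The sketch below is only an assumption-laden illustration of that idea; the function names (`x_star_relu`, `x_star_projected`, `kkt_residual`) are hypothetical and do not come from the authors' implementation.

```python
def relu(v):
    """Rectified linear unit for a scalar input."""
    return max(v, 0.0)


def x_star_relu(theta):
    """Toy solution map written as a one-hidden-layer ReLU network.

    Hidden units: relu(theta) and relu(theta - 1); output weights +1 and -1.
    """
    return relu(theta) - relu(theta - 1.0)


def x_star_projected(theta):
    """Reference solution: projection of theta onto the interval [0, 1]."""
    return min(max(theta, 0.0), 1.0)


def kkt_residual(theta, x):
    """Largest KKT violation of a candidate x for the toy problem.

    Lagrangian (toy problem only):
        0.5*x**2 - theta*x + mu_lo*(-x) + mu_up*(x - 1)
    The multipliers are recovered from stationarity, so the stationarity
    term is zero by construction; the check is effectively on primal
    feasibility and complementary slackness.
    """
    grad = x - theta                 # gradient of the objective at x
    mu_lo = max(grad, 0.0)           # multiplier of -x <= 0
    mu_up = max(-grad, 0.0)          # multiplier of x - 1 <= 0
    stationarity = abs(grad - mu_lo + mu_up)
    primal = max(-x, x - 1.0, 0.0)   # constraint violation, if any
    complementarity = abs(mu_lo * x) + abs(mu_up * (x - 1.0))
    return max(stationarity, primal, complementarity)


if __name__ == "__main__":
    for theta in [-2.0, 0.25, 0.5, 1.5, 3.0]:
        x = x_star_relu(theta)
        print(theta, x, kkt_residual(theta, x))
```

In this toy case every weight of the network follows from the problem data, with nothing left to learn. The PSNN described above applies the same principle to general mp-QP, where only part of the weights can be derived analytically.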

Figures (14)

  • Figure 1: Illustration of Shallow (left) and Deep (right) NN models
  • Figure 2: The tradeoff between dataset size and model complexity in estimating arbitrary PWL functions
  • Figure 3: Sensitivity of $\boldsymbol{\mu}$ with respect to $\theta_1$
  • Figure 4: NN Model to Predict $\boldsymbol{\mu}^*$
  • Figure 6: Subnetwork predicting $\boldsymbol{\mu}^*$
  • ...and 9 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Lemma 1