Optimization over Trained Neural Networks: Difference-of-Convex Algorithm and Application to Data Center Scheduling

Xinwei Liu, Vladimir Dvorkin

TL;DR

The paper tackles optimization problems where objectives or constraints are learned by neural networks, focusing on ReLU-based networks. It introduces a bilinear reformulation that penalizes ReLU violations in the objective and solves the resulting problem via a difference-of-convex (DC) algorithm, with a principled method to select the penalty parameter $\rho$. The approach is applied to data-center demand allocation in power grids, replacing a difficult bilevel/OPF model with an NN-embedded optimization that yields significant potential savings, demonstrated on small and large test systems. The results show effective convergence to stationary points and practical improvements in electricity costs under congested grid conditions. The work provides a practical framework for optimization over trained neural networks with strong potential impact on grid operations and other decision-making problems leveraging data-driven cost models.

Abstract

When solving decision-making problems with mathematical optimization, some constraints or objectives may lack analytic expressions but can be approximated from data. When an approximation is made by neural networks, the underlying problem becomes optimization over trained neural networks. Despite recent improvements with cutting planes, relaxations, and heuristics, the problem remains difficult to solve in practice. We propose a new solution based on a bilinear problem reformulation that penalizes ReLU constraints in the objective function. This reformulation makes the problem amenable to efficient difference-of-convex algorithms (DCA), for which we propose a principled approach to penalty selection that facilitates convergence to stationary points of the original problem. We apply the DCA to the problem of the least-cost allocation of data center electricity demand in a power grid, reporting significant savings in congested cases.
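
To make the idea of penalizing ReLU constraints concrete, the following is a minimal sketch and not the authors' formulation. It assumes the standard complementarity form of a ReLU unit, $y = \max(0, z)$ if and only if $y \ge 0$, $y \ge z$, and $y(y - z) = 0$, and it uses one textbook way of splitting the bilinear residual into a difference of two convex quadratics. The function names (`relu_violation`, `dc_parts`) and the sample values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names and data are assumptions, not taken from the paper.

def relu(z):
    """Element-wise ReLU on a list of pre-activations."""
    return [max(0.0, zi) for zi in z]

def relu_violation(y, z):
    """Bilinear residual sum_i y_i * (y_i - z_i).

    For any y with y >= 0 and y >= z the residual is nonnegative,
    and it is zero exactly when y = relu(z). Scaled by a penalty
    weight, it can be added to an objective in place of the hard
    ReLU equality.
    """
    return sum(yi * (yi - zi) for yi, zi in zip(y, z))

def dc_parts(y, z):
    """Split the residual as g - h with g and h both convex quadratics.

    Uses y*(y - z) = y^2 + (y - z)^2 / 4 - (y + z)^2 / 4.
    """
    g = sum(yi * yi + 0.25 * (yi - zi) ** 2 for yi, zi in zip(y, z))
    h = sum(0.25 * (yi + zi) ** 2 for yi, zi in zip(y, z))
    return g, h

z = [-1.5, 0.0, 2.0]
y_exact = relu(z)            # a true ReLU output: residual vanishes
y_relaxed = [0.5, 0.3, 2.4]  # satisfies y >= 0 and y >= z, but is not relu(z)

v_exact = relu_violation(y_exact, z)
v_relaxed = relu_violation(y_relaxed, z)
g, h = dc_parts(y_relaxed, z)

print(v_exact, v_relaxed, g - h)
```

In a DC algorithm, the concave part $-h$ would be linearized at the current iterate so that each subproblem is convex; how the paper chooses its decomposition and the penalty $\rho$ is not shown here.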

Paper Structure

This paper contains 12 sections, 19 equations, 5 figures, 2 algorithms.

Figures (5)

  • Figure 1: Relationships between stationary points of problems in Sec. \ref{subsec:rho}. Such points satisfy the same set of KKT conditions [jara2018study].
  • Figure 2: 5-bus PJM test case: DCA trajectories for varying penalty $\rho$. The total demand charge (cost of electricity) is normalized to the ground truth.
  • Figure 3: Total charges for data center electricity consumption under the baseline and DCA allocations across 47 simulations in the IEEE 118-bus system.
  • Figure 4: DCA convergence in the experiments with the IEEE 118-bus system. The thin green lines indicate the actual trajectories using the nominal penalty $\rho^\star$. The thin red lines indicate the convergence trajectories using the $5\times\rho^\star$ penalty. The thick lines are the average trajectories.
  • Figure 5: The difference in LMPs induced by the baseline and DCA demand allocations in the IEEE 118-bus system. The red marks indicate buses hosting data centers. For the majority of these buses, the LMPs are significantly reduced under the DCA demand allocation compared to the baseline. Case $\#28$.

Theorems & Definitions (1)

  • Definition 1: Strongly stationary point