An analysis of optimization problems involving ReLU neural networks

Christoph Plate; Mirko Hahn; Alexander Klimek; Caroline Ganzer; Kai Sundmacher; Sebastian Sager

TL;DR

Embedding ReLU networks into MINLPs introduces large big-$M$ constants that hinder solver performance. The paper surveys and quantifies strategies including bound tightening (based on interval arithmetic (IA) or on LPs), a posteriori ReLU scaling, training-time regularization, clipped ReLU, and dropout. It demonstrates that LP-based tightening and scaling can substantially reduce the big-$M$ constants and improve run times, with regularization during training providing the strongest overall gains by reducing the number of linear regions and increasing the share of stable neurons. The results show a practical trade-off between the redundancy of the neural network model and optimization cost, and offer actionable guidance for designing surrogates and preprocessing steps that accelerate solving embedded optimization problems. Collectively, the work informs better integration of ReLU surrogates in engineering MINLPs and suggests directions for extending these insights to more complex models and problem classes.

Abstract

Solving mixed-integer optimization problems with embedded neural networks with ReLU activation functions is challenging. Big-M coefficients that arise in relaxing binary decisions related to these functions grow exponentially with the number of layers. We survey and propose different approaches to analyze and improve the run time behavior of mixed-integer programming solvers in this context. Among them are clipped variants and regularization techniques applied during training as well as optimization-based bound tightening and a novel scaling for given ReLU networks. We numerically compare these approaches for three benchmark problems from the literature. We use the number of linear regions, the percentage of stable neurons, and overall computational effort as indicators. As a major takeaway we observe and quantify a trade-off between the often desired redundancy of neural network models versus the computational costs for solving related optimization problems.
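To make the role of the big-M coefficients concrete, the following minimal sketch propagates interval-arithmetic bounds through a small ReLU network and reports the share of stable neurons. It is an illustration under stated assumptions, not the authors' implementation: the toy weights, the input box, and the helper names `affine_bounds`, `propagate`, and `stable_fraction` are hypothetical, and the big-M constraints in the comment are the standard textbook formulation, which may differ in detail from the one used in the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]

# Standard big-M model of y = max(0, a) with pre-activation bounds L <= a <= U:
#   y >= a,  y >= 0,  y <= a - L * (1 - z),  y <= U * z,  z in {0, 1}
# The tighter L and U are, the stronger the continuous relaxation.


def affine_bounds(W: Matrix, b: Vector, lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Interval-arithmetic bounds on W x + b for x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        l = u = bias
        for w, xl, xu in zip(row, lo, hi):
            if w >= 0:
                l += w * xl
                u += w * xu
            else:
                l += w * xu
                u += w * xl
        out_lo.append(l)
        out_hi.append(u)
    return out_lo, out_hi


def propagate(layers: List[Tuple[Matrix, Vector]], lo: Vector, hi: Vector):
    """Return pre-activation bounds (L, U) for every layer of a ReLU network."""
    bounds = []
    for W, b in layers:
        L, U = affine_bounds(W, b, lo, hi)
        bounds.append((L, U))
        # ReLU maps the interval [L, U] to [max(0, L), max(0, U)].
        lo = [max(0.0, v) for v in L]
        hi = [max(0.0, v) for v in U]
    return bounds


def stable_fraction(bounds) -> float:
    """Share of neurons whose ReLU is fixed (always active or always inactive)."""
    total = sum(len(L) for L, _ in bounds)
    stable = sum(1 for L, U in bounds for l, u in zip(L, U) if l >= 0 or u <= 0)
    return stable / total


# Hypothetical toy network: two layers with two neurons each, inputs in [-1, 1]^2.
toy_layers = [
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
]
toy_bounds = propagate(toy_layers, [-1.0, -1.0], [1.0, 1.0])
for depth, (L, U) in enumerate(toy_bounds, start=1):
    print(f"layer {depth}: L = {L}, U = {U}")
print("stable fraction:", stable_fraction(toy_bounds))
```

In this toy case the largest upper bound doubles from 2 to 4 between the first and second layer, which hints at how interval-arithmetic bounds can grow with depth. Optimization-based bound tightening would replace `affine_bounds` with an LP solve per neuron, at a higher computational cost.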

Figures (9)