Feed-Forward Neural Networks as a Mixed-Integer Program

Navid Aftabi; Nima Moradi; Fatemeh Mahroo

TL;DR

This paper investigates modeling feed-forward deep neural networks (FF-DNNs) with ReLU activations as mixed-integer programs (MIPs) to enable formal analysis, verification, and adversarial input generation. It compares several ReLU encodings (Big-M, extended, and disjunctive) together with pooling formulations that capture the piecewise-linear structure of FF-DNNs. It also addresses training via MIP for small networks and for discrete-activation variants (Binary FF-DNNs and Binarized Neural Networks). In MNIST-based experiments, the authors show that the extended and disjunctive formulations offer better tractability for smaller networks, whereas the convex-hull and Big-M approaches can become prohibitive as model size grows. The work demonstrates both the potential and the limitations of MIP-based methods for NN analysis and prescriptive tasks, and highlights the need for scalable bound propagation and substantial computing resources for larger architectures.

Abstract

Deep neural networks (DNNs) are widely studied in various applications. A DNN consists of layers of neurons that compute affine combinations of their inputs, apply nonlinear operations, and produce the corresponding activations. The rectified linear unit (ReLU) is a typical nonlinear operator; it outputs the maximum of its input and zero. With fixed parameters, a DNN built from such operators, including max pooling, where the maximum is taken over several input values, can be modeled as a mixed-integer program (MIP). This formulation uses continuous variables for unit outputs and binary variables for ReLU activation states, and it finds applications across diverse domains. This study explores the formulation of trained ReLU neurons as MIPs and applies MIP models to train neural networks (NNs). Specifically, it investigates interactions between MIP techniques and various NN architectures, including binary DNNs (which employ step activation functions) and binarized DNNs (with weights and activations limited to $\{-1, 0, +1\}$). The proposed approaches are trained and evaluated through experiments on handwritten digit classification models. The comparative study assesses the performance of trained ReLU NNs and sheds light on how effective MIP formulations are for training NNs.
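To make the ReLU-as-MIP idea concrete, the following is a minimal sketch of the textbook Big-M encoding of a single ReLU neuron. It is an illustration under stated assumptions, not necessarily the exact formulation or notation used in the paper. For a pre-activation a with known bounds lower <= a <= upper, the constraints y >= 0, y >= a, y <= a - lower*(1 - z), y <= upper*z with binary z are checked by enumeration rather than by a solver. The helper names (preactivation_bounds, bigm_output_interval, check_neuron), the weights, and the input box [0, 1]^2 are all hypothetical choices made for this example.

```python
def preactivation_bounds(w, b, x_lo, x_hi):
    """Interval-arithmetic bounds on a = w.x + b for x in the box [x_lo, x_hi].

    A simple form of bound propagation: each weight picks the box corner
    that minimises (or maximises) its own term.
    """
    lower = b + sum(wi * (lo if wi >= 0 else hi) for wi, lo, hi in zip(w, x_lo, x_hi))
    upper = b + sum(wi * (hi if wi >= 0 else lo) for wi, lo, hi in zip(w, x_lo, x_hi))
    return lower, upper


def bigm_output_interval(a, z, lower, upper):
    """Range of y allowed by the Big-M constraints for fixed a and binary z.

    Constraints (assumed textbook form):
        y >= 0,  y >= a,  y <= a - lower * (1 - z),  y <= upper * z
    Returns (lo, hi), or None if this choice of z is infeasible.
    """
    lo = max(0.0, a)
    hi = min(a - lower * (1 - z), upper * z)
    return (lo, hi) if lo <= hi else None


def check_neuron(w, b, x, x_lo, x_hi):
    """Enumerate z in {0, 1} and collect the feasible output intervals."""
    lower, upper = preactivation_bounds(w, b, x_lo, x_hi)
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    feasible = []
    for z in (0, 1):
        interval = bigm_output_interval(a, z, lower, upper)
        if interval is not None:
            feasible.append((z, interval))
    return a, feasible


if __name__ == "__main__":
    # Hypothetical neuron: two inputs in [0, 1], weights and bias chosen for illustration.
    w, b = [1.0, -2.0], 0.5
    x_lo, x_hi = [0.0, 0.0], [1.0, 1.0]
    for x in ([0.0, 0.0], [1.0, 1.0], [0.5, 0.25]):
        a, feasible = check_neuron(w, b, x, x_lo, x_hi)
        print(f"x={x} a={a} relu={max(0.0, a)} feasible (z, y-interval)={feasible}")
```

For every input in the box, each feasible choice of z collapses the allowed interval for y to the single value max(0, a), which is what lets a MIP solver reason about the network exactly. The tighter the bounds lower and upper, the tighter the linear relaxation, which is why bound propagation matters for larger networks.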

Paper Structure (21 sections, 46 equations, 2 figures, 5 tables)

Introduction
Background and related works
Artificial Neural Networks
Neurons of ANNs
Organization of ANNs
Training Artificial Neural Networks
Trained Artificial Neural Networks
FF-DNNs as MIP
Trained DNNs as MIP
Big-M formulation for ReLU
Extended formulation for ReLU
Disjunctive Programming for ReLU
Pooling layer formulations
Training DNNs by solving a MIP
Binary FF-DNNs
...and 6 more sections

Figures (2)

Figure 1: Single perceptron mathematical model (Goodfellow et al., 2016)
Figure 2: A simple illustration of FF-DNNs, e.g., CNNs
