Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks

Masoud Ataei, Edrin Hasaj, Jacob Gipp, Sepideh Forouzi

TL;DR

This work addresses the challenge of training neural networks that are simultaneously accurate, sparse, and interpretable by introducing a unified mixed-integer linear programming framework. It provides exact formulations for both dense and convolutional architectures, encoding ReLU activations with binary indicators and supporting structural pruning at the layer, neuron, and channel levels, all within a single global optimization objective that blends data fit with elastic-net–style regularization and architectural penalties. The approach yields globally optimal solutions with respect to a composite objective and offers verifiability by construction, while enabling domain-specific constraints to be embedded linearly. Empirical results on small to moderate datasets demonstrate meaningful sparsity and competitive performance, illustrating the framework's potential as a certifiable alternative to gradient-based training for interpretable and verifiable neural networks.

Abstract

This paper presents a unified mixed-integer programming framework for training sparse and interpretable neural networks. We develop exact formulations for both fully connected and convolutional architectures by modeling nonlinearities such as ReLU activations through binary variables and encoding structural sparsity via filter- and layer-level pruning constraints. The resulting models integrate parameter learning, architecture selection, and structural regularization within a single optimization problem, yielding globally optimal solutions with respect to a composite objective that balances prediction accuracy, weight sparsity, and architectural compactness. The mixed-integer programming formulation accommodates piecewise-linear operations, including max pooling and activation gating, and permits precise enforcement of logic-based or domain-specific constraints. By incorporating considerations of interpretability, sparsity, and verifiability directly into the training process, the proposed framework bridges a range of research areas including explainable artificial intelligence, symbolic reasoning, and formal verification.
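To make "modeling ReLU activations through binary variables" concrete, the sketch below checks one standard encoding, the big-M formulation. This is an assumption for illustration; the abstract does not state which encoding the paper uses. For a pre-activation z with |z| <= M, an output y, and a binary indicator a, the constraints are y >= z, y >= 0, y <= z + M(1 - a), and y <= M·a. The function names, the bound M, and the candidate grid are hypothetical.

```python
def relu_big_m_feasible(z, y, a, M, tol=1e-9):
    """Check whether (z, y, a) satisfies the big-M ReLU constraints.

    Assumed encoding (not taken from the paper):
        y >= z,  y >= 0,  y <= z + M*(1 - a),  y <= M*a,  a in {0, 1}
    It is only valid when M bounds the pre-activation: |z| <= M.
    """
    return (
        a in (0, 1)
        and y >= z - tol
        and y >= -tol
        and y <= z + M * (1 - a) + tol
        and y <= M * a + tol
    )


def feasible_outputs(z, M, candidates):
    """Return the candidate outputs y that are feasible for some binary a."""
    return sorted(
        {y for y in candidates for a in (0, 1) if relu_big_m_feasible(z, y, a, M)}
    )


if __name__ == "__main__":
    M = 3.0
    grid = [i * 0.5 for i in range(-6, 7)]  # candidate outputs in [-3, 3]
    for z in (-2.0, -0.5, 0.0, 1.5):
        print(z, feasible_outputs(z, M, grid))
```

For every z on the grid, the only feasible output is max(0, z). The linear constraints together with the binary indicator therefore reproduce the ReLU exactly rather than approximately, provided M is a valid bound on the pre-activation.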

Paper Structure

This paper contains 5 sections, 32 equations, 2 tables.