ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation

Christoph Hertrich, Leon Sering

TL;DR

The paper studies the exact expressivity of ReLU neural networks (NNs) as a model of real-valued computation. It introduces Max-Affine Arithmetic Programs (MAAPs) and proves that they are equivalent to NNs with respect to depth, width, and size. Using this framework, it constructs polynomial-size networks for two classic combinatorial optimization problems: computing the value of a minimum spanning tree of an undirected graph with $n$ nodes via a network of size $\mathcal{O}(n^3)$ and depth $\mathcal{O}(n\log n)$, and computing a maximum $s$-$t$ flow in a directed graph with $n$ nodes and $m$ arcs via a network of size $\mathcal{O}(m^2n^2)$ and width $\mathcal{O}(1)$. These results imply that both problems admit strongly polynomial-time algorithms built solely from affine transformations and maxima, without comparison-based branching. The authors also discuss implications for learning theory, Boolean and tropical circuit models, and parallel computation, and they outline future research directions, including potential lower bounds and extensions to further combinatorial optimization problems. MAAPs thereby serve as a bridge between algorithm design and neural-network constructions.

Abstract

This paper studies the expressive power of artificial neural networks with rectified linear units. In order to study them as a model of real-valued computation, we introduce the concept of Max-Affine Arithmetic Programs and show equivalence between them and neural networks concerning natural complexity measures. We then use this result to show that two fundamental combinatorial optimization problems can be solved with polynomial-size neural networks. First, we show that for any undirected graph with $n$ nodes, there is a neural network (with fixed weights and biases) of size $\mathcal{O}(n^3)$ that takes the edge weights as input and computes the value of a minimum spanning tree of the graph. Second, we show that for any directed graph with $n$ nodes and $m$ arcs, there is a neural network of size $\mathcal{O}(m^2n^2)$ that takes the arc capacities as input and computes a maximum flow. Our results imply that these two problems can be solved with strongly polynomial time algorithms that solely use affine transformations and maxima computations, but no comparison-based branchings.

Paper Structure

This paper contains 25 sections, 10 theorems, 4 equations, 3 figures.

Key Result

Theorem 1

For a fixed graph with $n$ vertices, there exists an NN of depth $\mathcal{O}(n\log n)$, width $\mathcal{O}(n^2)$, and size $\mathcal{O}(n^3)$ that correctly maps a vector of edge weights to the value of a minimum spanning tree.
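To give a feel for what "computing a minimum spanning tree value with only affine maps and maxima" means, here is a minimal sketch for the smallest non-trivial case, a triangle. It is an illustration only and not the paper's general construction: on a triangle, every spanning tree drops exactly one edge, so the minimum spanning tree value equals the sum of the three weights minus the heaviest one. The function names `relu`, `relu_max`, and `triangle_mst_value` are hypothetical and chosen for this example.

```python
def relu(z: float) -> float:
    """Rectified linear unit: the only nonlinearity used below."""
    return max(0.0, z)


def relu_max(a: float, b: float) -> float:
    """max(a, b) written as an affine map around one ReLU: a + relu(b - a)."""
    return a + relu(b - a)


def triangle_mst_value(w_ab: float, w_bc: float, w_ca: float) -> float:
    """MST value of a triangle, using only affine operations and ReLUs.

    Assumption for this sketch: the graph is a fixed triangle, so the
    minimum spanning tree consists of the two lightest edges.
    """
    heaviest = relu_max(relu_max(w_ab, w_bc), w_ca)
    return w_ab + w_bc + w_ca - heaviest


if __name__ == "__main__":
    print(triangle_mst_value(1.0, 2.0, 3.0))
```

Note that the code never branches on a comparison of the weights; the graph is fixed and only the edge weights vary, which matches the setting of the theorem.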

Figures (3)

  • Figure 1: A small NN with two input neurons $\bm x_1$ and $\bm x_2$, a single ReLU neuron labelled with the shape of the ReLU function, and one output neuron $\bm y$. It computes the function $\bm x\mapsto \bm y= \bm x_2-\max\{0,\bm x_2-\bm x_1\}=-\max\{-\bm x_2,-\bm x_1\}=\min\{\bm x_1,\bm x_2\}$.
  • Figure 2: This example shows that the outcome of one iteration of the Edmonds-Karp algorithm for computing a maximum flow depends discontinuously on the arc capacities. Here, a small adjustment of the capacity of arc $st$ leads to a drastic change of the flow after the first iteration.
  • Figure 3: Example of the FindAugmentingFlow$_k$ subroutine for $k=4$. The edge labels in the top figure are the residual capacity bounds in the current iteration. The first step is to compute the fattest path values $\bm a_{i,v}$, which are depicted as node labels in the top figure. The values $\bm Y_v^i$ always denote the excess flow of a vertex $v$ with distance $i$ from the sink. All values that are not displayed are zero. At $s$, we initialize $\bm Y_s^4=\bm a_{4,s}=6$. Then, excess flow is pushed greedily towards the sink, as shown in the four figures in the middle. While doing so, we ensure that at each vertex the arriving flow does not exceed its value $\bm a_{i,v}$. For this reason, flow can get stuck, as happens at $v_4$ in this example. Therefore, in a final cleanup phase, depicted in the two bottom figures, we push flow back to the source $s$. Observe that the result is an $s$-$t$-flow that is feasible with respect to the residual capacities, uses only paths of length $k=4$, and saturates the arc $v_6t$.
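
The network described in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration of that formula, assuming a single hidden ReLU neuron with input $\bm x_2-\bm x_1$ and an output neuron that subtracts the hidden value from $\bm x_2$; the function names `relu` and `nn_min` are hypothetical.

```python
def relu(z: float) -> float:
    """Rectified linear unit."""
    return max(0.0, z)


def nn_min(x1: float, x2: float) -> float:
    """Forward pass of the two-input network from the Figure 1 caption.

    Hidden layer: one ReLU neuron computing max{0, x2 - x1}.
    Output layer: the affine map y = x2 - hidden, which equals min{x1, x2}.
    """
    hidden = relu(x2 - x1)
    return x2 - hidden


if __name__ == "__main__":
    print(nn_min(3.0, 5.0), nn_min(5.0, 3.0))
```

When $\bm x_1 \le \bm x_2$, the ReLU neuron outputs $\bm x_2-\bm x_1$ and the output is $\bm x_1$; otherwise the neuron outputs zero and the output is $\bm x_2$.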

Theorems & Definitions (26)

  • Theorem 1
  • Theorem 2
  • Proposition 2
  • Proposition 2
  • proof
  • Proposition 2
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • ...and 16 more