Size and depth of monotone neural networks: interpolation and approximation

Dan Mikulincer; Daniel Reichman

TL;DR

This work studies monotone neural networks, that is, threshold networks whose weights (other than the biases) are non-negative, focusing on their expressive power and their ability to interpolate monotone functions on $[0,1]^d$. It proves that depth-4 monotone threshold networks universally approximate monotone functions, improving on the previous best-known depth-$(d+1)$ construction when $d>3$, and gives a near-tight interpolation construction whose intermediate layer is wide. The paper also establishes an exponential separation between the size required by monotone networks and by general threshold networks, via harmonic extension and circuit reductions, using the Tardos function to connect to monotone circuit lower bounds. These results illuminate fundamental trade-offs between monotonicity constraints and representation efficiency, with implications for designing monotone models and for understanding connections between neural networks and circuit complexity.

Abstract

We study monotone neural networks with threshold gates where all the weights (other than the biases) are non-negative. We focus on the expressive power and efficiency of representation of such networks. Our first result establishes that every monotone function over $[0,1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network. When $d > 3$, we improve upon the previous best-known construction which has depth $d+1$. Our proof goes by solving the monotone interpolation problem for monotone datasets using a depth-4 monotone threshold network. In our second main result we compare size bounds between monotone and arbitrary neural networks with threshold gates. We find that there are monotone real functions that can be computed efficiently by networks with no restriction on the gates whereas monotone networks approximating these functions need exponential size in the dimension.
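To make the object of study concrete, here is a minimal sketch of a monotone threshold network as the abstract describes it: every weight is non-negative, biases are unrestricted, and every gate is a threshold. The gate convention (output 1 when the pre-activation is at least 0), the helper names, and the toy weights in `LAYERS` are illustrative assumptions, not taken from the paper, and this is not the paper's depth-4 construction.

```python
from itertools import product

def threshold(z):
    # Threshold gate (assumed convention): 1 if the pre-activation is >= 0, else 0.
    return 1.0 if z >= 0 else 0.0

def monotone_layer(x, weights, biases):
    # One layer of threshold gates. Weights must be non-negative; biases may be any real.
    for row in weights:
        if any(w < 0 for w in row):
            raise ValueError("monotone networks require non-negative weights")
    return [threshold(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def monotone_network(x, layers):
    # Forward pass through a list of (weights, biases) pairs.
    h = list(x)
    for weights, biases in layers:
        h = monotone_layer(h, weights, biases)
    return h

def is_monotone_on_grid(layers, points_per_axis, dim=2):
    # Brute-force check: x <= y coordinate-wise implies N(x) <= N(y) on a grid in [0,1]^dim.
    axis = [i / (points_per_axis - 1) for i in range(points_per_axis)]
    grid = list(product(axis, repeat=dim))
    for x, y in product(grid, repeat=2):
        if all(a <= b for a, b in zip(x, y)):
            if monotone_network(x, layers)[0] > monotone_network(y, layers)[0]:
                return False
    return True

# Hypothetical toy network on [0,1]^2 with two layers of threshold gates.
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [-0.5, -0.5, -1.5]),
    ([[1.0, 1.0, 2.0]], [-2.0]),
]

print(monotone_network([0.0, 0.0], LAYERS))
print(monotone_network([1.0, 1.0], LAYERS))
print(is_monotone_on_grid(LAYERS, 5))
```

Because each gate is non-decreasing in its inputs whenever its weights are non-negative, and a composition of non-decreasing maps is non-decreasing, any network built this way computes a monotone function; the grid check only confirms this on a finite sample.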

Paper Structure

This paper contains 25 sections, 17 theorems, 56 equations, 1 figure, and 1 table.

Introduction
Our contributions
On expressive power and interpolation
Efficiency when compared to general networks
Related work
Preliminaries and notation
A counter-example to expressibility
Four layers suffice with threshold activation
First hidden layer
Second hidden layer
The third hidden layer
The final layer
Interpolating monotone networks are wide
Universal approximation
An exponential separation between the size of monotone and arbitrary threshold networks
...and 10 more sections

Key Result

Lemma 1

There exists a monotone function $f:[0,1] \to \mathbb{R}$ and a constant $c > 0$, such that for any monotone network $N$ with ReLU gates, there exists $x \in [0,1]$, such that

Figures (1)

Figure 1: Plots of the monotone interpolating networks with labels taken from the function $f(x) = \|x\|_2$ on the left, and $f(x) = e^{\|x\|_2}$ on the right. Panels (i)-(iii) show interpolations over three grids: (i) a $10 \times 10$ grid, (ii) a $20 \times 20$ grid, and (iii) a $30 \times 30$ grid. Panel (iv) plots the original function.
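The datasets behind Figure 1 can be sketched as follows: sample a $k \times k$ grid, label each point with $f$, and confirm that the labels are monotone, which is the precondition for monotone interpolation. Placing the grid on $[0,1]^2$ with evenly spaced points, and the helper names, are assumptions; the caption specifies only the grid sizes and the two functions.

```python
import math
from itertools import product

def grid_dataset(k, f):
    # k x k evenly spaced grid on [0,1]^2 (assumed domain), labeled by f.
    points = [(i / (k - 1), j / (k - 1)) for i, j in product(range(k), repeat=2)]
    return [(p, f(p)) for p in points]

def l2_norm(p):
    # f(x) = ||x||_2, the left-hand function in Figure 1.
    return math.hypot(p[0], p[1])

def exp_l2_norm(p):
    # f(x) = exp(||x||_2), the right-hand function in Figure 1.
    return math.exp(l2_norm(p))

def is_monotone_dataset(data):
    # Monotone dataset: x <= y coordinate-wise implies label(x) <= label(y).
    return all(fx <= fy
               for (x, fx), (y, fy) in product(data, repeat=2)
               if all(a <= b for a, b in zip(x, y)))

data = grid_dataset(10, l2_norm)
print(len(data), is_monotone_dataset(data))
```

The same call with `k=20` or `k=30` gives the other two grids in the caption; the pairwise check is quadratic in the number of points, which is still cheap at these sizes.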

Theorems & Definitions (36)

Lemma 1
proof
Lemma 2
Theorem 3
Lemma 4
Theorem 5
Theorem 6
proof: Proof of Lemma \ref{lem:impossible}
Lemma 7
proof
...and 26 more
