Expressive Power of ReLU and Step Networks under Floating-Point Operations

Yeachan Park; Geonho Hwang; Wonyeol Lee; Sejun Park

TL;DR

This work analyzes the expressive power of neural networks when computations are performed in floating-point (FP) arithmetic, addressing a key gap between theory and practice. It formulates two FP models, $\mathbb{F}_p$ (unbounded exponent) and $\mathbb{F}_{p,q}$ (bounded exponent), and proves that Step and ReLU networks can memorize arbitrary finite datasets and universally approximate continuous functions under both models, with parameter counts matching, up to constants, those of classical results that assume exact real arithmetic. The results cover both idealized and practical floating-point formats, including IEEE 754 formats and bfloat16, and quantify the intrinsic FP representation errors that bound achievable accuracy. The proofs rely on FP-specific lemmas that control rounding, overflow, and exactness of arithmetic within the network constructions. Overall, the paper shows that standard FP formats retain the same asymptotic expressivity guarantees as models with exact real arithmetic, which informs the design and analysis of neural networks run on real hardware.
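To make the "bounded exponent" setting concrete, the following minimal sketch tabulates the precision and exponent width of a few common formats and derives two quantities relevant to rounding and overflow: the unit roundoff and the largest finite value. The names `FORMATS`, `unit_roundoff`, and `largest_finite` are illustrative and not from the paper, and the way these bit counts map onto the paper's parameters $p$ and $q$ is an assumption that may differ from the paper's exact convention.

```python
from fractions import Fraction

# name -> (significand precision in bits, including the implicit leading bit,
#          exponent field width in bits). Standard values for these formats.
FORMATS = {
    "binary16": (11, 5),
    "bfloat16": (8, 8),
    "binary32": (24, 8),
    "binary64": (53, 11),
}

def unit_roundoff(precision_bits: int) -> Fraction:
    """Maximum relative error of round-to-nearest for normal numbers: 2**-precision."""
    return Fraction(1, 2 ** precision_bits)

def largest_finite(precision_bits: int, exponent_bits: int) -> Fraction:
    """Largest finite value of an IEEE-style format with the given bit widths."""
    emax = 2 ** (exponent_bits - 1) - 1
    return (2 - Fraction(1, 2 ** (precision_bits - 1))) * Fraction(2) ** emax

for name, (prec, exp_bits) in FORMATS.items():
    print(name, float(unit_roundoff(prec)), float(largest_finite(prec, exp_bits)))
```

Exact rationals (`Fraction`) are used so that the table itself is not affected by the rounding it describes; values are converted to `float` only for display.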

Abstract

The study of the expressive power of neural networks has investigated the fundamental limits of neural networks. Most existing results assume real-valued inputs and parameters as well as exact operations during the evaluation of neural networks. However, neural networks are typically executed on computers that can only represent a tiny subset of the reals and apply inexact operations, i.e., most existing results do not apply to neural networks used in practice. In this work, we analyze the expressive power of neural networks under a more realistic setup: when we use floating-point numbers and operations as in practice. Our first set of results assumes floating-point operations where the significand of a float is represented by finite bits but its exponent can take any integer value. Under this setup, we show that neural networks using a binary threshold unit or ReLU can memorize any finite input/output pairs and can approximate any continuous function within an arbitrary error. In particular, the number of parameters in our constructions for universal approximation and memorization coincides with that in classical results assuming exact mathematical operations. We also show similar results on memorization and universal approximation when floating-point operations use finite bits for both significand and exponent; these results are applicable to many popular floating-point formats such as those defined in the IEEE 754 standard (e.g., 32-bit single-precision format) and bfloat16.
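The first setup in the abstract (finite significand, unbounded exponent) can be illustrated with a small rounding routine. The sketch below is not the paper's definition: it assumes a significand of `p` bits in total (including the leading bit) and round-to-nearest with ties to even, and the helper names `round_to_precision` and `fp_add` are hypothetical.

```python
from fractions import Fraction

def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round x to the nearest number with a p-bit significand and any integer
    exponent (ties to even). Assumed convention; the paper's may differ."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find the exponent e with 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    # Spacing between neighbouring representable numbers in [2**e, 2**(e + 1)).
    ulp = Fraction(2) ** (e - p + 1)
    scaled = mag / ulp                       # lies in [2**(p - 1), 2**p)
    n = scaled.numerator // scaled.denominator
    rem = scaled - n
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and n % 2 == 1):
        n += 1                               # round up (ties go to the even integer)
    return sign * n * ulp

def fp_add(a: Fraction, b: Fraction, p: int) -> Fraction:
    """Floating-point addition modelled as exact addition followed by rounding."""
    return round_to_precision(a + b, p)

p = 3
tenth = round_to_precision(Fraction(1, 10), p)   # 1/10 is not representable
absorbed = fp_add(Fraction(1), Fraction(1, 8), p)  # the small addend is lost
print(tenth, absorbed)
```

With `p = 3`, the value 1/10 rounds to 3/32 and 1 + 1/8 rounds back to 1, which shows both sources of inexactness the abstract refers to: most reals are not representable, and operations on representable numbers need not be exact.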

Paper Structure (40 sections, 24 theorems, 209 equations, 1 figure)

Introduction
Summary of contribution
Organization
Problem setup and notations
Notations
Floating-point numbers
Floating-point operations
Neural networks
Memorization
Universal approximation
Expressive power of neural networks under $\mathbb{F}_p$
$\mathrm{Step}$ network results
$\mathrm{ReLU}$ network results
Expressive power of neural networks under $\mathbb{F}_{p,q}$
$\mathrm{Step}$ network results
...and 25 more sections

Key Result

Lemma 1

For any $p\in\mathbb{N}$ and $d\in[2^{p}]$, there exists a $\mathrm{Step}$ network $f_{\theta}(\,\cdot\,;\mathbb{F}_p):\mathbb{F}_p^d\to\mathbb{F}_p$ of $3$ layers and $6d+2$ parameters that satisfies the following: for any $\alpha=(\alpha_1,\dots,\alpha_d),\beta=(\beta_1,\dots,\beta_d)\in\mathbb{F}$ ... for all $\mathbf{x}\in[0,1]^d$.

Figures (1)

Figure 1: An illustration of the neural network defined in \ref{eq:def-nn}. In this example, the parameters are set as follows: $d=5$, $L=3$, and $N_1=N_2=3$.
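A network of the shape in Figure 1 can be sketched as follows, with every multiplication and addition rounded to IEEE 754 binary32. Several details here are assumptions rather than the paper's definition: the layers are taken to be fully connected, the output is taken to be scalar (widths 5, 3, 3, 1), sums are accumulated left to right with the bias added last, and the activation is ReLU. The helpers `f32`, `affine`, `forward`, `constant_params`, and `count_params` are hypothetical names.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def relu(x: float) -> float:
    return x if x > 0.0 else 0.0

def affine(weights, bias, x):
    """Affine map with binary32 rounding after every multiply and every add.
    The left-to-right summation order is an assumption."""
    out = []
    for row, b in zip(weights, bias):
        acc = 0.0
        for w, xi in zip(row, x):
            acc = f32(acc + f32(w * xi))
        out.append(f32(acc + b))
    return out

def forward(params, x):
    """Apply each layer in turn; ReLU follows every layer except the last."""
    h = [f32(v) for v in x]
    for i, (weights, bias) in enumerate(params):
        h = affine(weights, bias, h)
        if i < len(params) - 1:
            h = [relu(v) for v in h]
    return h

def constant_params(widths, w=0.5, b=0.0):
    """Toy parameters: every weight equals w and every bias equals b."""
    return [([[f32(w)] * n_in for _ in range(n_out)], [f32(b)] * n_out)
            for n_in, n_out in zip(widths, widths[1:])]

def count_params(params):
    return sum(len(W) * len(W[0]) + len(b) for W, b in params)

widths = [5, 3, 3, 1]   # d = 5, N_1 = N_2 = 3, scalar output (assumed)
params = constant_params(widths)
output = forward(params, [1.0] * 5)
print(count_params(params), output)
```

Under these assumptions the network has 34 parameters. The toy weights are chosen so that every intermediate value is exactly representable in binary32; with generic weights, the result generally depends on the summation order.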

Theorems & Definitions (39)

Lemma 1
Theorem 2
Theorem 3
Corollary 4
Theorem 5
Theorem 6
Corollary 7
Theorem 8
Theorem 9
Theorem 10
...and 29 more
