Expressive Power of ReLU and Step Networks under Floating-Point Operations
Yeachan Park, Geonho Hwang, Wonyeol Lee, Sejun Park
TL;DR
This work analyzes the expressive power of neural networks when computations are performed in floating-point (FP) arithmetic, addressing a key gap between theory and practice. It formulates two FP models, $\mathbb{F}_p$ (finite-bit significand, unbounded exponent) and $\mathbb{F}_{p,q}$ (finite-bit significand and exponent), and proves that Step and ReLU networks can memorize arbitrary finite datasets and universally approximate continuous functions under both regimes, with parameter counts matching those of classical results that assume exact real arithmetic, up to constants. The results cover both idealized and practical floating-point formats, including IEEE 754 variants and bfloat16, and quantify the intrinsic FP representation errors that bound achievable accuracy. The proofs rely on a set of FP-specific lemmas that control rounding, overflow, and exactness of arithmetic within the network constructions. Overall, the paper shows that standard FP formats retain the same asymptotic expressivity guarantees as exact real-arithmetic models, which informs the design and analysis of neural networks run on real hardware.
Abstract
The study of the expressive power of neural networks has investigated the fundamental limits of neural networks. Most existing results assume real-valued inputs and parameters as well as exact operations during the evaluation of neural networks. However, neural networks are typically executed on computers that can only represent a tiny subset of the reals and apply inexact operations, i.e., most existing results do not apply to neural networks used in practice. In this work, we analyze the expressive power of neural networks under a more realistic setup: when we use floating-point numbers and operations as in practice. Our first set of results assumes floating-point operations where the significand of a float is represented by finite bits but its exponent can take any integer value. Under this setup, we show that neural networks using a binary threshold unit or ReLU can memorize any finite input/output pairs and can approximate any continuous function within an arbitrary error. In particular, the number of parameters in our constructions for universal approximation and memorization coincides with that in classical results assuming exact mathematical operations. We also show similar results on memorization and universal approximation when floating-point operations use finite bits for both significand and exponent; these results are applicable to many popular floating-point formats such as those defined in the IEEE 754 standard (e.g., 32-bit single-precision format) and bfloat16.
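To make the unbounded-exponent setting concrete, here is a minimal Python sketch of arithmetic in which every operation is rounded to a $p$-bit significand while the exponent may be any integer. It is an illustration only, not the paper's construction: the helper names (`round_to_p_bits`, `fp_add`, `fp_mul`, `fp_neuron`), the round-to-nearest, ties-to-even rule, and the convention that `p` counts all significand bits including the leading one are assumptions made for this example and may differ from the paper's definitions.

```python
from fractions import Fraction


def round_to_p_bits(x: Fraction, p: int) -> Fraction:
    """Round x to the nearest value m * 2**k with |m| < 2**p (ties to even).

    Assumption for this sketch: p counts every significand bit, and the
    exponent k is an unbounded integer, so overflow and underflow never occur.
    """
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    a = abs(x)
    # Find e with 2**e <= a < 2**(e + 1) using exact integer arithmetic.
    e = a.numerator.bit_length() - a.denominator.bit_length()
    if Fraction(2) ** e > a:
        e -= 1
    # Spacing between adjacent representable values near a.
    ulp = Fraction(2) ** (e - (p - 1))
    # round() on a Fraction returns an int and breaks ties toward even.
    m = round(a / ulp)
    return sign * m * ulp


def fp_add(a: Fraction, b: Fraction, p: int) -> Fraction:
    """Exact sum followed by a single rounding."""
    return round_to_p_bits(a + b, p)


def fp_mul(a: Fraction, b: Fraction, p: int) -> Fraction:
    """Exact product followed by a single rounding."""
    return round_to_p_bits(a * b, p)


def relu(x: Fraction) -> Fraction:
    return max(x, Fraction(0))


def step(x: Fraction) -> Fraction:
    """Binary threshold unit: 1 if x >= 0, else 0."""
    return Fraction(1) if x >= 0 else Fraction(0)


def fp_neuron(weights, inputs, bias, p, activation=relu):
    """One neuron whose affine map is accumulated left to right with rounding."""
    acc = bias
    for w, x in zip(weights, inputs):
        acc = fp_add(acc, fp_mul(w, x, p), p)
    return activation(acc)


if __name__ == "__main__":
    p = 4
    one, sixteen = Fraction(1), Fraction(16)
    # Rounded addition is not associative: the two groupings disagree.
    print(fp_add(fp_add(sixteen, one, p), one, p))  # 16
    print(fp_add(sixteen, fp_add(one, one, p), p))  # 18
    # A ReLU neuron computing 16 + 1*1 + 1*1 returns 16 rather than 18.
    print(fp_neuron([one, one], [one, one], sixteen, p))  # 16
```

With `p = 4`, the value 17 lies halfway between the representable neighbors 16 and 18 and rounds to 16, so small contributions can be absorbed during accumulation. Effects of this kind are why constructions that are straightforward under exact arithmetic need separate care about rounding once every operation is inexact.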
