A Quantifier-Reversal Approximation Paradigm for Recurrent Neural Networks
Clemens Hutter, Valentin Abadie, Helmut Bölcskei
TL;DR
This work proposes a quantifier-reversal paradigm for neural function approximation with recurrent networks: for each target function $f$, a single RNN with fixed topology and fixed weights approximates $f$ to arbitrary accuracy simply by running for more time steps, with the error decaying exponentially in the runtime $t$, as $4^{-Ct}$. The authors develop an explicit RNN calculus, including clocked concatenation and weight sharing, that emulates the depth-driven composition of classical deep networks and thereby enables function composition and polynomial construction within one architecture. For univariate polynomials, they build squaring and multiplication networks, organize monomials in a pyramid, and assemble a polynomial-approximation RNN whose error decays as $\|a\|_1 C_1 4^{-C_2 t}$ and whose hidden-state dimension scales linearly with the degree (e.g., $80N+11$ for degree $N$). The results point toward hardware implementations that trade scarce memory for runtime, and open avenues for extensions to multivariate polynomials, general smooth functions, and low-precision weight regimes.
Abstract
Classical neural network approximation results take the form: for every function $f$ and every error tolerance $\varepsilon > 0$, one constructs a neural network whose architecture and weights depend on $\varepsilon$. This paper introduces a fundamentally different approximation paradigm that reverses this quantifier order. For each target function $f$, we construct a single recurrent neural network (RNN) with fixed topology and fixed weights that approximates $f$ to within any prescribed tolerance $\varepsilon > 0$ when run for sufficiently many time steps. The key mechanism enabling this quantifier reversal is temporal computation combined with weight sharing: rather than increasing network depth, the approximation error is reduced solely by running the RNN longer. This yields exponentially decaying approximation error as a function of runtime while requiring storage of only a small, fixed set of weights. Such architectures are appealing for hardware implementations where memory is scarce and runtime is comparatively inexpensive. To initiate the systematic development of this novel approximation paradigm, we focus on univariate polynomials. Our RNN constructions emulate the structural calculus underlying deep feed-forward ReLU network approximation theory: parallelization, linear combinations, affine transformations, and, most importantly, a clocked mechanism that realizes function composition within a single recurrent architecture. The resulting RNNs have size independent of the error tolerance $\varepsilon$ and hidden-state dimension linear in the degree of the polynomial.
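To make the idea of trading depth for runtime concrete, the following minimal Python sketch approximates $x^2$ on $[0,1]$ with a three-dimensional ReLU recurrence whose weights never change. It is an illustration under stated assumptions, not the construction from the paper: it uses the classical sawtooth identity $x^2 = x - \sum_{k \ge 1} g_k(x)/4^k$, where $g_k$ is the $k$-fold composition of the hat function, and the function name `square_rnn` and the state variables `h`, `c`, `s` are hypothetical. The state `h` carries the rescaled sawtooth $g_t(x)/4^t$, `c` carries $4^{-t}$, and `s` is the running approximation; after $t$ steps the error is at most $4^{-(t+1)}$.

```python
def relu(z: float) -> float:
    """Rectified linear unit."""
    return max(0.0, z)


def square_rnn(x: float, t: int) -> float:
    """Approximate x**2 for x in [0, 1] by running a fixed-weight
    ReLU recurrence for t time steps (illustrative sketch only).

    State (h, c, s):
      h = g_t(x) / 4**t   rescaled t-fold sawtooth of x
      c = 4**(-t)         scale tracker, updated by a fixed linear map
      s = x - sum_{k<=t} g_k(x) / 4**k   current approximation of x**2
    """
    h, c, s = x, 1.0, x  # initial state at time 0
    for _ in range(t):
        # Same weights at every step: one ReLU unit with threshold c/2.
        h_next = 0.5 * h - relu(h - 0.5 * c)
        c = 0.25 * c
        s = s - h_next
        h = h_next
    return s


if __name__ == "__main__":
    grid = [i / 1000 for i in range(1001)]
    for t in (0, 2, 4, 8):
        worst = max(abs(square_rnn(x, t) - x * x) for x in grid)
        print(f"t={t}: max error on grid = {worst:.3e}, bound = {4.0 ** -(t + 1):.3e}")
```

Only the number of iterations `t` changes between accuracy levels; the update rule and its coefficients stay fixed, which is the behaviour the abstract describes for the reversed quantifier order.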
