On efficiently computable functions, deep networks and sparse compositionality
Tomaso Poggio
TL;DR
The paper links efficient Turing computability to compositional sparsity: any $f:[0,1]^d\to\mathbb{R}^m$ computable in time polynomial in the input and output bit-depths admits, at each fixed precision $(n,m_{\mathrm{out}})$, a bounded-fan-in DAG representation and a correspondingly efficient neural approximant. It sets up a precise discretization framework via $Q_n$ and $F_{n,m_{\mathrm{out}}}$, shows that unrolling the Turing machine into a circuit yields networks of size $\mathrm{poly}(n+m_{\mathrm{out}})$, and emulates each gate neurally with controlled error to reach accuracy $2^{-m_{\mathrm{out}}}$. The work connects to compositional approximation theory and depth-efficiency results, where composition through sparse, low-arity structures gives favorable approximation rates, and it views optimization as hierarchical search over such structures. It also bridges discrete Boolean circuits to smooth neural representations via a smooth lift, and discusses autoregressive universality as a related learning-theoretic perspective.
Abstract
We show that \emph{efficient Turing computability} at any fixed input/output precision implies the existence of \emph{compositionally sparse} (bounded-fan-in, polynomial-size) DAG representations and of corresponding neural approximants achieving the target precision. Concretely: if $f:[0,1]^d\to\mathbb{R}^m$ is computable in time polynomial in the bit-depths, then for every pair of precisions $(n,m_{\mathrm{out}})$ there exists a bounded-fan-in Boolean circuit of size and depth $\mathrm{poly}(n+m_{\mathrm{out}})$ computing the discretized map; replacing each gate by a constant-size neural emulator yields a deep network of size/depth $\mathrm{poly}(n+m_{\mathrm{out}})$ that achieves accuracy $\varepsilon=2^{-m_{\mathrm{out}}}$. We also relate these constructions to compositional approximation rates \cite{MhaskarPoggio2016b,poggio_deep_shallow_2017,Poggio2017,Poggio2023HowDS} and to optimization viewed as hierarchical search over sparse structures.
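As a rough illustration of the gate-replacement step, the following Python sketch quantizes inputs to $n$ bits, evaluates a bounded-fan-in circuit, and emulates every gate with one or two ReLU units. It is not the paper's construction: the specific ReLU formulas, the helper names (`quantize`, `dequantize`, `full_adder`, `average`), and the choice of target function $(x+y)/2$ are all assumptions made for the example. The gates are exact only on inputs in $\{0,1\}$.

```python
def relu(z):
    return max(0.0, z)

# Constant-size ReLU emulators of Boolean gates (exact on inputs in {0, 1}).
def not_gate(a):
    return 1.0 - a

def and_gate(a, b):
    return relu(a + b - 1.0)

def or_gate(a, b):
    return 1.0 - relu(1.0 - a - b)

def xor_gate(a, b):
    return relu(a + b) - 2.0 * relu(a + b - 1.0)

def quantize(x, n):
    """Truncate x in [0, 1] to n bits, most significant bit first."""
    k = min(int(x * 2**n), 2**n - 1)  # clamp so that x = 1.0 fits in n bits
    return [float((k >> (n - 1 - i)) & 1) for i in range(n)]

def dequantize(bits):
    """Read bits (most significant first) as a binary fraction in [0, 1)."""
    return sum(b * 2.0 ** -(i + 1) for i, b in enumerate(bits))

def full_adder(a, b, c):
    """Five fan-in-2 gates: sum bit and carry-out of a + b + c."""
    t = xor_gate(a, b)
    s = xor_gate(t, c)
    carry = or_gate(and_gate(a, b), and_gate(c, t))
    return s, carry

def average(x, y, n):
    """Approximate (x + y) / 2 with a ripple-carry circuit of 5n gates."""
    xb, yb = quantize(x, n), quantize(y, n)
    carry, out = 0.0, []
    for a, b in zip(reversed(xb), reversed(yb)):  # least significant bit first
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)
    # The (n+1)-bit sum, read as a binary fraction, equals the average
    # of the two quantized inputs.
    return dequantize(list(reversed(out)))
```

In this toy case the circuit has $5n$ gates and depth $O(n)$, so the emulating network grows linearly with the precision, and the output differs from $(x+y)/2$ by at most $2^{-n}$, which comes entirely from quantizing the inputs.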
