Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions

Yongqiang Cai

TL;DR

The paper studies universal approximation when learnable weights are replaced by a finite vocabulary of flow maps. It gives a constructive proof that a finite set $V$ of flow maps, with $|V|=O(d^2)$, suffices to approximate any continuous map on a compact domain (in $L^p$ norms) via compositions of its elements, and that the same compositions have $C$-UAP for orientation-preserving diffeomorphisms. The approach combines dynamical-systems concepts (affine and leaky-ReLU flow maps), the Lie product formula, and Kronecker's approximation theorem to realize arbitrary flows through finite compositions, and then gives a two-part construction of an explicit $V$. Beyond function approximation, the work introduces a compositional flow-language model (CFSM) that can represent regular languages via flow grammars, providing a bridge between formal language theory and continuous-time mappings. The results offer a new perspective on compositionality in machine learning and NLP, suggesting that sentence meanings could be embedded as nonlinear mappings composed from a finite, well-structured vocabulary of flow maps.
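The role of the Lie product formula, $e^{A+B} = \lim_{n\to\infty} (e^{A/n} e^{B/n})^n$, can be checked numerically. The sketch below is illustrative only and is not taken from the paper: the matrices `A` and `B`, the helper names, and the truncated Taylor series used for the matrix exponential are all assumptions chosen to keep the example dependency-free. It uses the fact that the linear vector field $\dot{x} = Ax$ has flow map $e^{tA}$, so alternating short flows of two fields approximates the flow of their sum.

```python
# Pure-Python 2x2 check of the Lie product formula:
#   exp(A + B) = lim_{n -> inf} (exp(A/n) exp(B/n))^n
# All names and matrices here are illustrative assumptions.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(X, c):
    return [[c * x for x in row] for row in X]

def expm(X, terms=30):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)  # term is now X^k / k!
        result = add(result, term)
    return result

def matpow(X, n):
    result = identity(len(X))
    for _ in range(n):
        result = matmul(result, X)
    return result

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Two non-commuting generators; their sum generates a hyperbolic flow.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
TARGET = expm(add(A, B))

def lie_error(n):
    """Max entrywise gap between (exp(A/n) exp(B/n))^n and exp(A + B)."""
    step = matmul(expm(scale(A, 1.0 / n)), expm(scale(B, 1.0 / n)))
    return max_abs_diff(matpow(step, n), TARGET)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, lie_error(n))
```

The error shrinks roughly like $1/n$, which is why a target flow can be reached by composing many short flows drawn from a small fixed set.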

Abstract

In recent years, deep learning-based sequence models, such as language models, have attracted much attention and achieved great success, prompting researchers to explore the possibility of transforming non-sequential problems into a sequential form. Following this thought, deep neural networks can be represented as composite functions of a sequence of mappings, linear or nonlinear, where each composition can be viewed as a "word". However, the weights of linear mappings are undetermined and hence require an infinite number of words. In this article, we investigate the finite case and constructively prove the existence of a finite "vocabulary" $V=\{\phi_i: \mathbb{R}^d \to \mathbb{R}^d \mid i=1,\ldots,n\}$ with $n=O(d^2)$ for the universal approximation. That is, for any continuous mapping $f: \mathbb{R}^d \to \mathbb{R}^d$, compact domain $\Omega$ and $\varepsilon>0$, there is a sequence of mappings $\phi_{i_1}, \ldots, \phi_{i_m} \in V$, $m \in \mathbb{Z}_+$, such that the composition $\phi_{i_m} \circ \cdots \circ \phi_{i_1}$ approximates $f$ on $\Omega$ with an error less than $\varepsilon$. Our results demonstrate an unusual approximation power of mapping compositions and motivate a novel compositional model for regular languages.
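To make the "words and sentences" picture concrete, here is a minimal sketch of composing maps drawn from a finite vocabulary. The vocabulary below is a hypothetical toy (two coordinate shifts and one leaky-ReLU map on $\mathbb{R}^2$); it is not the vocabulary constructed in the paper, and the names `VOCAB`, `shift`, `leaky_relu`, and `compose` are assumptions made for illustration.

```python
# Toy illustration: a "sentence" is a list of word names, and its meaning
# is the composition of the corresponding maps. The vocabulary is hypothetical.

def shift(axis, delta):
    """Translation by delta along one coordinate axis."""
    def phi(x):
        return tuple(v + delta if i == axis else v for i, v in enumerate(x))
    return phi

def leaky_relu(alpha):
    """Coordinate-wise leaky-ReLU with negative slope alpha."""
    def phi(x):
        return tuple(v if v >= 0 else alpha * v for v in x)
    return phi

VOCAB = {
    "x0+": shift(0, 0.5),
    "x0-": shift(0, -0.5),
    "leaky": leaky_relu(0.5),
}

def compose(sentence):
    """Return phi_{i_m} o ... o phi_{i_1}; the first word is applied first."""
    def f(x):
        for word in sentence:
            x = VOCAB[word](x)
        return x
    return f

if __name__ == "__main__":
    point = (0.25, -2.0)
    print(compose(["x0-", "leaky"])(point))  # shift first, then leaky-ReLU
    print(compose(["leaky", "x0-"])(point))  # leaky-ReLU first, then shift
```

The two sentences use the same words in a different order and produce different maps, which is the sense in which word order carries meaning in a composition.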

Paper Structure (35 sections, 19 theorems, 92 equations, 1 figure, 1 table)

Key Result

Theorem 2.2

Let $\Omega \subset \mathbb{R}^d$ be a compact domain. Then, there is a finite set $V \subset \overline{\text{Diff}_0(\mathbb{R}^d)}$ such that the hypothesis space $\mathcal{H}_V$ of all finite compositions of maps in $V$ has $C$-UAP for $\text{Diff}_0(\mathbb{R}^d)$.
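For orientation, the statement can be unpacked as follows. This is a reading based on the abstract and on the usual meaning of uniform approximation; the paper's own Definition 2.1 and its equation for $\mathcal{H}_V$ are not reproduced on this page, so the exact form and the choice of norm below are assumptions.

```latex
% Hypothesis space: all finite compositions of vocabulary maps.
\mathcal{H}_V = \left\{ \phi_{i_m} \circ \cdots \circ \phi_{i_1}
  \;\middle|\; \phi_{i_1}, \ldots, \phi_{i_m} \in V,\ m \in \mathbb{Z}_+ \right\}

% C-UAP for Diff_0(R^d), read as uniform approximation on Omega:
\forall f \in \text{Diff}_0(\mathbb{R}^d),\ \forall \varepsilon > 0,\
\exists h \in \mathcal{H}_V :\quad
\sup_{x \in \Omega} \| f(x) - h(x) \| < \varepsilon
```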

Figures (1)

  • Figure 1: Example of (a) DFA and (b) flow grammar.
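Since the figure itself is not reproduced here, the following sketch uses a hypothetical DFA (binary strings with an even number of `1`s) to show why automata fit the compositional view: each input symbol acts as a map on states, and reading a string composes those maps in order. The language, state names, and function names are assumptions for illustration and are not the example from Figure 1.

```python
# Hypothetical DFA: accepts binary strings containing an even number of '1's.
# Each symbol is a map on states; a string is the composition of its symbols.

TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

def symbol_map(symbol):
    """Return the state-to-state map induced by one input symbol."""
    def step(state):
        return TRANSITIONS[(state, symbol)]
    return step

def run(word, start="even"):
    """Apply the symbol maps left to right, starting from the start state."""
    state = start
    for symbol in word:
        state = symbol_map(symbol)(state)
    return state

def accepts(word):
    return run(word) == "even"

if __name__ == "__main__":
    for w in ("", "10", "1001", "111"):
        print(repr(w), accepts(w))
```

A flow grammar, as described in the summary above, plays a similar role with continuous maps in place of these finite state transitions.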

Theorems & Definitions (42)

  • Definition 2.1: Universal approximation property, UAP
  • Theorem 2.2
  • Corollary 2.3
  • Remark 2.4
  • Definition 2.5: Generalized leaky-ReLU
  • Theorem 2.6
  • Lemma 2.7
  • Proposition 3.1
  • Proposition 3.2
  • Theorem 3.3: Lie product formula
  • ...and 32 more