Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions

Yongqiang Cai

TL;DR

The paper addresses universal approximation by replacing learnable weights with a finite vocabulary of flow maps. It develops a constructive theory showing that a finite set $V$ of flow maps, with $|V|=O(d^2)$, suffices to approximate any continuous map on a compact domain in the $L^p$ norm via compositions, and yields $C$-UAP (uniform approximation) for orientation-preserving diffeomorphisms. The approach combines dynamical-systems concepts (affine and leaky-ReLU flow maps), the Lie product formula, and Kronecker's approximation theorem to realize arbitrary flows through finite compositions; the proof is a two-part construction that yields an explicit $V$. Beyond function approximation, the work introduces a compositional flow-language model (CFSM) that can represent regular languages via flow grammars, providing a bridge between formal language theory and continuous-time mappings. The results offer a new perspective on compositionality in machine learning and NLP, suggesting that sentence meanings could be embedded as nonlinear mappings composed from a finite, well-structured vocabulary of flow maps.
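As a rough numerical illustration of the Lie product formula mentioned above, $e^{A+B} = \lim_{n\to\infty} (e^{A/n} e^{B/n})^n$, the sketch below checks the convergence for two non-commuting $2 \times 2$ matrices. The matrices `A` and `B`, the helper names, and the truncated power series used for the exponential are all assumptions made for this example; they are not taken from the paper.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(X, c):
    """Multiply every entry of X by the scalar c."""
    return [[c * v for v in row] for row in X]

def expm(X, terms=30):
    """Matrix exponential via a truncated power series (fine for small norms)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = scale(matmul(term, X), 1.0 / k)  # term is now X^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def matpow(X, n):
    """n-th power of X by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, X)
    return result

def lie_product(A, B, n):
    """(exp(A/n) exp(B/n))^n, which converges to exp(A + B) as n grows."""
    step = matmul(expm(scale(A, 1.0 / n)), expm(scale(B, 1.0 / n)))
    return matpow(step, n)

def max_abs_diff(X, Y):
    """Largest entrywise absolute difference between X and Y."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

# Hypothetical example matrices; they do not commute, so exp(A)exp(B) != exp(A+B).
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
target = expm([[A[i][j] + B[i][j] for j in range(2)] for i in range(2)])

# The error shrinks as the number of alternating factors grows.
errors = [max_abs_diff(lie_product(A, B, n), target) for n in (1, 10, 100, 1000)]
for n, err in zip((1, 10, 100, 1000), errors):
    print(n, err)
```

In this toy setting, the flow of the sum $A+B$ is recovered from alternating short-time flows of $A$ and $B$ alone, which is the sense in which a few generators can stand in for a richer family of maps.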

Abstract

In recent years, deep learning-based sequence modeling, such as language models, has received much attention and success, which has pushed researchers to explore the possibility of transforming non-sequential problems into a sequential form. Following this line of thought, deep neural networks can be represented as composite functions of a sequence of mappings, linear or nonlinear, where each composed mapping can be viewed as a \emph{word}. However, the weights of linear mappings are undetermined and hence require an infinite number of words. In this article, we investigate the finite case and constructively prove the existence of a finite \emph{vocabulary} $V=\{\phi_i: \mathbb{R}^d \to \mathbb{R}^d \mid i=1,\ldots,n\}$ with $n=O(d^2)$ for universal approximation. That is, for any continuous mapping $f: \mathbb{R}^d \to \mathbb{R}^d$, compact domain $\Omega$ and $\varepsilon>0$, there is a sequence of mappings $\phi_{i_1}, \ldots, \phi_{i_m} \in V$, $m \in \mathbb{Z}_+$, such that the composition $\phi_{i_m} \circ \cdots \circ \phi_{i_1}$ approximates $f$ on $\Omega$ with an error less than $\varepsilon$. Our results demonstrate an unusual approximation power of mapping compositions and motivate a novel compositional model for regular languages.
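To make the idea of composing words from a finite vocabulary concrete, here is a minimal toy sketch. It assumes a one-dimensional linear flow $\dot{x} = x$ with flow map $\phi^t(x) = e^t x$ and a hypothetical two-word vocabulary $\{\phi^{1}, \phi^{-\sqrt{2}}\}$. Because $\sqrt{2}$ is irrational, Kronecker's theorem implies that sums $p - q\sqrt{2}$ with nonnegative integers $p, q$ come arbitrarily close to any target time $t$, so a long enough sentence approximates $\phi^t$. The vocabulary, the function names, and the brute-force search are assumptions for illustration only, not the paper's construction.

```python
import math

SQRT2 = math.sqrt(2.0)

def word_forward(x):
    """Hypothetical word 1: flow map of dx/dt = x at time +1."""
    return math.e * x

def word_backward(x):
    """Hypothetical word 2: flow map of dx/dt = x at time -sqrt(2)."""
    return math.exp(-SQRT2) * x

def find_sentence(t, tol, max_q=100000):
    """Search for nonnegative integers (p, q) with |p - q*sqrt(2) - t| < tol."""
    for q in range(max_q):
        p = round(t + q * SQRT2)
        if p >= 0 and abs(p - q * SQRT2 - t) < tol:
            return p, q
    raise ValueError("no sentence found within max_q backward words")

def apply_sentence(x, p, q):
    """Apply p forward and q backward words to x.

    The two words commute, so they are interleaved to keep the elapsed
    flow time bounded and avoid floating-point overflow.
    """
    elapsed = 0.0
    while p > 0 or q > 0:
        if q > 0 and (elapsed > 0 or p == 0):
            x = word_backward(x)
            elapsed -= SQRT2
            q -= 1
        else:
            x = word_forward(x)
            elapsed += 1.0
            p -= 1
    return x

t, tol, x0 = 0.3, 1e-3, 2.0
p, q = find_sentence(t, tol)
approx = apply_sentence(x0, p, q)   # composition of p + q words from the vocabulary
exact = math.exp(t) * x0            # target flow map phi^t applied to x0
print(p, q, approx, exact)
```

The sentence length $p + q$ grows as the tolerance shrinks, which mirrors the trade-off in the abstract: the vocabulary stays finite, while the number of composed words $m$ depends on the accuracy $\varepsilon$.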

Paper Structure (35 sections, 19 theorems, 92 equations, 1 figure, 1 table)

Introduction
Contributions
Related works
Outline
Notations and main results
Preliminaries
Notations
Main theorem
Sketch of the proof
Proof of the construction Part 1
Affine flows and leaky-ReLU flows
Application of Lie product formula
Application of Kronecker's theorem
Proof of the construction Part 2
Approximate the OP diffeomorphism by deep compositions
...and 20 more sections

Key Result

Theorem 2.2

Let $\Omega \subset \mathbb{R}^d$ be a compact domain. Then, there is a finite set $V \subset \overline{\text{Diff}_0(\mathbb{R}^d)}$ such that the hypothesis space $\mathcal{H}_V$ of finite compositions of maps from $V$ has $C$-UAP for $\text{Diff}_0(\mathbb{R}^d)$.
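
For readers without the full text at hand, the objects in this statement can be spelled out as follows. This is a sketch assuming the composition form given in the abstract and the usual uniform-norm meaning of $C$-UAP; the paper's own Definition 2.1 and its equation for $\mathcal{H}_V$ are not reproduced on this page and may differ in detail.

$$
\mathcal{H}_V = \left\{ \phi_{i_m} \circ \cdots \circ \phi_{i_1} \;\middle|\; \phi_{i_1}, \ldots, \phi_{i_m} \in V,\ m \in \mathbb{Z}_+ \right\}
$$

$$
\forall f \in \text{Diff}_0(\mathbb{R}^d),\ \forall \varepsilon > 0,\ \exists h \in \mathcal{H}_V : \quad \sup_{x \in \Omega} \| f(x) - h(x) \| < \varepsilon
$$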

Figures (1)

Figure 1: Example of (a) DFA and (b) flow grammar.

Theorems & Definitions (42)

Definition 2.1: Universal approximation property, UAP
Theorem 2.2
Corollary 2.3
Remark 2.4
Definition 2.5: Generalized leaky-ReLU
Theorem 2.6
Lemma 2.7
Proposition 3.1
Proposition 3.2
Theorem 3.3: Lie product formula
...and 32 more
