
Geometric separation and constructive universal approximation with two hidden layers

Chanyoung Sung

TL;DR

This work provides a constructive universal approximation framework using neural networks with two hidden layers (depth 3) for any compact $K\subset\Bbb R^n$ and any $f\in C(K)$, applicable to both sigmoidal and ReLU activations. Central to the method are Urysohn-type separation lemmas that explicitly separate disjoint compact sets, with the second hidden layer acting as a selector that aggregates multiple separation gadgets. The author obtains uniform approximation by iteratively reducing oscillation via a convergent sum of depth-3 networks, and proves a sharp depth-2 result for finite $K$, where a finite family of separating functions yields the approximation. While the construction is not width-efficient, it illuminates a geometric trade-off between depth and explicit realizability, and it unifies constructive approaches across activation types.

Abstract

We give a geometric construction of neural networks that separate disjoint compact subsets of $\Bbb R^n$, and use it to obtain a constructive universal approximation theorem. Specifically, we show that networks with two hidden layers and either a sigmoidal activation (i.e., strictly monotone bounded continuous) or the ReLU activation can approximate any real-valued continuous function on an arbitrary compact set $K\subset\Bbb R^n$ to any prescribed accuracy in the uniform norm. For finite $K$, the construction simplifies and yields a sharp depth-2 (single hidden layer) approximation result.
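To make the finite-$K$ statement concrete, here is a minimal sketch of how a single hidden layer of sigmoids can match prescribed values on finitely many points. It is a standard staircase construction on the real line, not the paper's argument, and it says nothing about sharpness. The names `sigmoid` and `make_network`, the logistic activation, the restriction to distinct points in $\Bbb R$, and the sample data are all assumptions made for illustration.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function (an assumed choice of sigmoidal activation)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def make_network(points, values, eps):
    """Hypothetical helper: build a one-hidden-layer network x -> b + sum_i a_i * sigmoid(c*(x - m_i))
    whose value at each given point is within eps of the prescribed value.
    Assumes the points are distinct real numbers."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    xs = [points[i] for i in order]
    ys = [values[i] for i in order]
    # One hidden unit per gap: threshold at the midpoint, output weight = jump in value.
    mids = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    jumps = [b - a for a, b in zip(ys, ys[1:])]
    if not mids:
        return lambda x: ys[0]  # a single point needs only the bias
    half_gap = min(b - a for a, b in zip(xs, xs[1:])) / 2.0
    total = sum(abs(j) for j in jumps)
    # Each unit is within exp(-c * half_gap) of its 0/1 limit at every data point,
    # so the total error is at most total * exp(-c * half_gap) < eps for this slope.
    c = math.log(total / eps + 2.0) / half_gap

    def net(x):
        return ys[0] + sum(j * sigmoid(c * (x - m)) for j, m in zip(jumps, mids))

    return net

if __name__ == "__main__":
    pts, vals = [2.0, 0.0, 3.5, 0.5], [1.0, -2.0, 0.25, 4.0]
    net = make_network(pts, vals, 1e-3)
    for p, v in zip(pts, vals):
        print(p, v, net(p))
```

The sketch uses one hidden unit per gap between consecutive points and pays for accuracy only through the slope `c`. For points in $\Bbb R^n$ one would first have to reduce to the one-dimensional case, which this sketch does not attempt.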

Paper Structure

This paper contains 5 sections, 6 theorems, 70 equations, and 3 figures.

Key Result

Lemma 3.1

Let $A$ and $\{\bold{p}\}$ be nonempty disjoint closed subsets in $\Bbb R^n$. Then for any $\epsilon>0$ there exists $h\in \mathcal{N}_{2}$ such that $h(\bold{p})<\epsilon$, $h>1-\epsilon$ on $A$, and $h(\Bbb R^n)\subseteq (0,1)$.
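The three conclusions of the lemma can be checked numerically in a one-dimensional toy case. The sketch below takes $\bold{p}=0$ and $A=\{x : |x|\ge 2\delta\}$ and sums the two sigmoid terms $\sigma(c(x-\frac{3\delta}{2}))$ and $\sigma(c(-x-\frac{3\delta}{2}))$ that appear in the Figure 1 caption. Summing them, the logistic choice of $\sigma$, the slope formula, and the names `separator` and `slope_for` are assumptions made for illustration; the sketch does not claim that this sum is the paper's $\psi$ or that it belongs to $\mathcal{N}_{2}$ as the paper defines it.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function (an assumed choice of sigmoidal activation)."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def separator(x, c, delta):
    """Hypothetical 1-D separator: small near 0, close to 1 where |x| >= 2*delta.
    The two arguments sum to -3*c*delta < 0, so mathematically the value lies in (0, 1)."""
    return sigmoid(c * (x - 1.5 * delta)) + sigmoid(c * (-x - 1.5 * delta))

def slope_for(eps, delta):
    """Hypothetical slope choice for 0 < eps < 1/2.
    It gives c*delta/2 = ln((1-eps)/eps) + 1, hence sigmoid(c*delta/2) > 1 - eps,
    and 2*sigmoid(-1.5*c*delta) < eps."""
    return 2.0 * (math.log((1.0 - eps) / eps) + 1.0) / delta

if __name__ == "__main__":
    eps, delta = 0.01, 0.025  # illustrative values only
    c = slope_for(eps, delta)
    print("h(p)        =", separator(0.0, c, delta))
    print("h(2*delta)  =", separator(2 * delta, c, delta))
    print("h(-2*delta) =", separator(-2 * delta, c, delta))
```

In floating point the value can round to exactly 1.0 far from the origin, so the strict upper bound in the lemma is a statement about the exact function rather than its computed values.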

Figures (3)

  • Figure 1: $\sigma(c(x-\frac{3\delta}{2})), \sigma(c(-x-\frac{3\delta}{2}))$ and $\psi$
  • Figure 2: $\cup_{i=1}^n\pi_i^{-1}((-2\delta,2\delta))$ when $n=2$ and $2\delta=0.05$
  • Figure 3: $\varphi_{1,1}$ and $\varphi_{1,1,2}$

Theorems & Definitions (12)

  • Lemma 3.1
  • proof
  • Lemma 3.2
  • proof
  • Lemma 3.3
  • proof
  • Lemma 3.4
  • proof
  • Theorem 4.1
  • proof
  • ...and 2 more