Neural networks in non-metric spaces

Luca Galimberti

TL;DR

The paper broadens neural network theory to input spaces that are non-metric, by embedding quasi-Polish spaces into a separable Hilbert space and composing infinite-dimensional networks with a data-embedding map $F$. It proves universal approximation for scalar and vector-valued targets, including locally convex and general topological vector spaces, and shows how to obtain finite-parameter, practically trainable architectures through projections and truncations. It also extends the framework to targets that are quasi-Polish themselves, using a finite-range projection to map back into the target space, at the cost of Borel measurability. A key obstruction result justifies the necessity of the quasi-Polish setting for achieving expressive, stable, and implementable infinite-dimensional approximators. Collectively, the work provides a rigorous foundation for learning with functional inputs and broadens the applicability of neural operators to a wide class of infinite-dimensional spaces.

Abstract

Leveraging the infinite dimensional neural network architecture we proposed in arXiv:2109.13512v4 and which can process inputs from Fréchet spaces, and using the universal approximation property shown therein, we now largely extend the scope of this architecture by proving several universal approximation theorems for a vast class of input and output spaces. More precisely, the input space $\mathfrak X$ is allowed to be a general topological space satisfying only a mild condition ("quasi-Polish"), and the output space can be either another quasi-Polish space $\mathfrak Y$ or a topological vector space $E$. Similarly to arXiv:2109.13512v4, we show furthermore that our neural network architectures can be projected down to "finite dimensional" subspaces with any desirable accuracy, thus obtaining approximating networks that are easy to implement and allow for fast computation and fitting. The resulting neural network architecture is therefore applicable for prediction tasks based on functional data. To the best of our knowledge, this is the first result which deals with such a wide class of input/output spaces and simultaneously guarantees the numerical feasibility of the ensuing architectures. Finally, we prove an obstruction result which indicates that the category of quasi-Polish spaces is in a certain sense the correct category to work with if one aims at constructing approximating architectures on infinite-dimensional spaces $\mathfrak X$ which, at the same time, have sufficient expressive power to approximate continuous functions on $\mathfrak X$, are specified by a finite number of parameters only and are "stable" with respect to these parameters.
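The "embed, then project down to finite dimensions" idea described above can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the paper's construction: the inputs are taken to be functions sampled on a grid, the bounded functionals `h_k` (a tanh of discretized cosine coefficients) are a hypothetical choice, the weights `1/k` are one convenient way to land in a square-summable sequence, and the one-hidden-layer `network` stands in for the truncated architecture. Whether such functionals separate points depends on the input class and is not checked here.

```python
import numpy as np


def embed(x, n_coords, grid):
    """Truncated embedding F_N(x) = (h_k(x) / k) for k = 1..N, a stand-in for l^2.

    Hypothetical choice of features: h_k(x) = tanh(mean of x(t) * cos(k*pi*t)),
    a bounded, continuous functional of the sampled input x.
    """
    values = x(grid)                                     # sample the input function
    k = np.arange(1, n_coords + 1)
    basis = np.cos(np.pi * k[:, None] * grid[None, :])   # shape (N, len(grid))
    h = np.tanh(basis @ values / grid.size)              # each h_k(x) lies in (-1, 1)
    return h / k                                         # 1/k weights give square-summability


def network(z, W, b, a):
    """One-hidden-layer network on the truncated coordinates: a . tanh(W z + b)."""
    return float(a @ np.tanh(W @ z + b))


# Finite parameter set (W, b, a); the sizes are arbitrary illustrative values.
rng = np.random.default_rng(0)
N, hidden = 16, 8
grid = np.linspace(0.0, 1.0, 200)
W = rng.normal(size=(hidden, N))
b = rng.normal(size=hidden)
a = rng.normal(size=hidden)

z = embed(np.sin, N, grid)   # input "point": the function t -> sin(t) on [0, 1]
y = network(z, W, b, a)      # scalar prediction from finitely many parameters
```

In this sketch the truncation level `N` plays the role of the projection onto a finite-dimensional subspace: because the coordinates are bounded by `1/k`, the part of the embedding discarded by stopping at `N` has norm at most the square root of the tail sum of `1/k^2`, which shrinks as `N` grows.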

Paper Structure (18 sections, 20 theorems, 208 equations)

Introduction
Related literature and comparison with other results
Outline
Preliminaries
Notation and conventions
A primer on quasi-Polish spaces
Examples of quasi-Polish spaces
Infinite-dimensional neural networks
Quasi-Polish neural networks architectures
Main results
Universal approximation theorem for quasi-Polish spaces: the scalar case
Universal approximation results for quasi-Polish spaces: the vector-valued case
The target space is a locally convex space
The target space is a topological vector space
Universal approximation results for targets that are quasi-Polish
...and 3 more sections

Key Result

Theorem 2.17

Let $\sigma:V\to V$ be continuous with bounded range $\sigma(V)$, and suppose it satisfies the paper's abstract condition on $\sigma$. Then $\mathfrak N(\sigma)$ is dense in $C(V)$ when $C(V)$ is equipped with the topology of uniform convergence on compacts. In other words, given $f\in C(V)$, then, for any compact subset $K$ of ...

Theorems & Definitions (54)

Definition 2.1
Example 2.2: Separable metrizable spaces
Example 2.3: Fréchet spaces carrying a Schauder basis
Example 2.4: Separable normed spaces
Example 2.5: Separable normed space with the weak topology
Example 2.6: Weak-star topology
Example 2.7: Countable Cartesian product
Example 2.8: Space of bounded linear operators
Example 2.9: $C_b(\mathbb R^d)$
Example 2.10: $L^\infty(\Omega,\mathcal{A},\mu)$
...and 44 more
