Neural networks in non-metric spaces
Luca Galimberti
TL;DR
The paper extends neural network theory to non-metric input spaces by embedding quasi-Polish spaces into a separable Hilbert space and composing infinite-dimensional networks with a data-embedding map $F$. It proves universal approximation for scalar- and vector-valued targets, including targets in locally convex and general topological vector spaces, and shows how projections and truncations yield finite-parameter, practically trainable architectures. It also extends the framework to targets that are themselves quasi-Polish, using a finite-range projection to map back into the target space, at the cost of Borel measurability. An obstruction result indicates that the quasi-Polish setting is, in a certain sense, the right one for approximators on infinite-dimensional spaces that are expressive, stable, and specified by finitely many parameters. Collectively, the work provides a rigorous foundation for learning with functional inputs and widens the applicability of neural operators to a large class of infinite-dimensional spaces.
Abstract
Leveraging the infinite-dimensional neural network architecture we proposed in arXiv:2109.13512v4, which can process inputs from Fréchet spaces, and using the universal approximation property shown therein, we substantially extend the scope of this architecture by proving several universal approximation theorems for a vast class of input and output spaces. More precisely, the input space $\mathfrak X$ is allowed to be a general topological space satisfying only a mild condition ("quasi-Polish"), and the output space can be either another quasi-Polish space $\mathfrak Y$ or a topological vector space $E$. As in arXiv:2109.13512v4, we furthermore show that our neural network architectures can be projected down to "finite-dimensional" subspaces with any desired accuracy, thus obtaining approximating networks that are easy to implement and allow for fast computation and fitting. The resulting neural network architecture is therefore applicable to prediction tasks based on functional data. To the best of our knowledge, this is the first result that deals with such a wide class of input/output spaces and simultaneously guarantees the numerical feasibility of the ensuing architectures. Finally, we prove an obstruction result which indicates that the category of quasi-Polish spaces is, in a certain sense, the correct category to work with if one aims at constructing approximating architectures on infinite-dimensional spaces $\mathfrak X$ which, at the same time, have sufficient expressive power to approximate continuous functions on $\mathfrak X$, are specified by only a finite number of parameters, and are "stable" with respect to these parameters.
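To make the idea of projecting down to a "finite-dimensional" subspace concrete, the following is a minimal illustrative sketch and not the construction used in the paper. It assumes the input is a function on $[0,1]$ sampled on a midpoint grid, and it uses the first few cosine coefficients as a stand-in for an embedding into a separable Hilbert space followed by truncation. All names (cosine_basis, embed_truncated, network, truncation_error) and the choice of basis, activation, and widths are hypothetical.

```python
import numpy as np

def cosine_basis(n_modes, n_points):
    """Midpoint grid on [0, 1] and the first n_modes orthonormal cosine functions."""
    t = (np.arange(n_points) + 0.5) / n_points
    k = np.arange(n_modes)[:, None]
    basis = np.sqrt(2.0) * np.cos(np.pi * k * t[None, :])
    basis[0, :] = 1.0  # the constant mode has unit norm without the sqrt(2) factor
    return t, basis

def embed_truncated(samples, n_modes):
    """Assumed stand-in for 'embed, then truncate': the first n_modes
    cosine coefficients of a sampled function (midpoint-rule inner products)."""
    _, basis = cosine_basis(n_modes, samples.shape[-1])
    return samples @ basis.T / samples.shape[-1]

def init_network(n_modes, width, rng):
    """Finitely many parameters for a one-hidden-layer network on the coefficients."""
    return {
        "W1": rng.normal(size=(width, n_modes)) / np.sqrt(n_modes),
        "b1": np.zeros(width),
        "w2": rng.normal(size=width) / np.sqrt(width),
        "b2": 0.0,
    }

def network(params, coeffs):
    """Scalar-valued network acting on the truncated coefficient vector."""
    hidden = np.tanh(coeffs @ params["W1"].T + params["b1"])
    return hidden @ params["w2"] + params["b2"]

def truncation_error(samples, n_modes):
    """Root-mean-square error of reconstructing the input from n_modes coefficients."""
    _, basis = cosine_basis(n_modes, samples.shape[-1])
    reconstruction = embed_truncated(samples, n_modes) @ basis
    return np.sqrt(np.mean((samples - reconstruction) ** 2))

rng = np.random.default_rng(0)
t, _ = cosine_basis(1, 256)
x = np.exp(-t) * np.sin(3.0 * t)        # one functional input, sampled on the grid
coeffs = embed_truncated(x, n_modes=8)  # finite-dimensional representation
params = init_network(n_modes=8, width=16, rng=rng)
y = network(params, coeffs)             # untrained prediction for this input
errors = [truncation_error(x, n) for n in (2, 8, 32)]
```

In this toy setting the truncation error does not increase as more modes are kept, which mirrors, in a much simpler situation, the abstract's statement that the architecture can be projected down with any desired accuracy. The network here is untrained; fitting its parameters is a standard finite-dimensional optimisation problem.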
