Approximation by non-symmetric networks for cross-domain learning

Hrushikesh Mhaskar

TL;DR

This paper initiates a general approach to studying the approximation capabilities of kernel based networks built on non-symmetric kernels, including generalized translation networks and rotated zonal function kernels.

Abstract

For the past 30 years or so, machine learning has stimulated a great deal of research in the study of approximation capabilities (expressive power) of a multitude of processes, such as approximation by shallow or deep neural networks, radial basis function networks, and a variety of kernel based methods. Motivated by applications such as invariant learning, transfer learning, and synthetic aperture radar imaging, we initiate in this paper a general approach to study the approximation capabilities of kernel based networks using non-symmetric kernels. While singular value decomposition is a natural instinct to study such kernels, we consider a more general approach to include the use of a family of kernels, such as generalized translation networks (which include neural networks and translation invariant kernels as special cases) and rotated zonal function kernels. Naturally, unlike traditional kernel based approximation, we cannot require the kernels to be positive definite. In particular, we obtain estimates on the accuracy of uniform approximation of functions in a Sobolev class by ReLU$^r$ networks when $r$ is not necessarily an integer. Our general results apply to the approximation of functions with small smoothness compared to the dimension of the input space.
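To make the ReLU$^r$ terminology concrete, the following is a minimal sketch of a shallow network with the activation $t \mapsto \max(t, 0)^r$, where $r > 0$ need not be an integer. The function names, weights, biases, and coefficients are hypothetical choices made for illustration and are not taken from the paper; the sketch only evaluates such a network and does not reproduce the paper's constructions or estimates.

```python
def relu_r(t: float, r: float) -> float:
    """ReLU^r activation: max(t, 0) raised to the power r (r > 0, possibly non-integer)."""
    return t ** r if t > 0.0 else 0.0


def shallow_relu_r_network(x, weights, biases, coeffs, r):
    """Evaluate sum_k coeffs[k] * relu_r(<weights[k], x> + biases[k], r)."""
    total = 0.0
    for w, b, a in zip(weights, biases, coeffs):
        # Inner product of the weight vector with the input, plus the bias.
        pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += a * relu_r(pre_activation, r)
    return total


# Hypothetical example: 3 neurons on a 2-dimensional input with r = 1.5.
weights = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
biases = [0.0, 0.0, 0.0]
coeffs = [1.0, 2.0, 3.0]
value = shallow_relu_r_network((4.0, 1.0), weights, biases, coeffs, r=1.5)
```

With $r = 1$ this reduces to an ordinary ReLU network; a fractional $r$ such as 1.5 gives an activation that is once continuously differentiable at the origin but not twice.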

Paper Structure (13 sections, 16 theorems, 137 equations)

Introduction
General introduction
Technical introduction
Outline of the paper
Data spaces
Basic concepts
Degree of approximation
Smoothness classes
Asymmetric eignets
Main results
ReLU$^r$ networks
Proofs
Conclusions

Key Result

Proposition 2.1

Let $S>q+1$ be an integer, and let $H:\mathbb{R}\to \mathbb{R}$ be an even, $S$ times continuously differentiable, compactly supported function. Then for every $x,y\in \mathbb{X}$ and $N\ge 1$, an estimate holds (the displayed inequality is missing from this extract) in which the constant may depend upon $H$ and $S$, but not on $N$, $x$, or $y$.

Theorems & Definitions (39)

Definition 2.1
Example 2.1
Example 2.2
Example 2.3
Remark 2.1
Proposition 2.1
Theorem 2.1
Definition 2.2
Definition 2.3
Remark 2.2
...and 29 more
