Remarks on Lipschitz-Minimal Interpolation: Generalization Bounds and Neural Network Implementation

Arthur C. B. de Oliveira; Ruigang Wang; Ian R. Manchester; Eduardo D. Sontag

Abstract

This note establishes a theoretical framework for finding (potentially overparameterized) approximations of a function on a compact set with a-priori bounds for the generalization error. The approximation method considered is to choose, among all functions that (approximately) interpolate a given data set, one with a minimal Lipschitz constant. The paper establishes rigorous generalization bounds over practically relevant classes of approximators, including deep neural networks. It also presents a neural network implementation based on Lipschitz-bounded network layers and an augmented Lagrangian method. The results are illustrated for a problem of learning the dynamics of an input-to-state stable system with certified bounds on simulation error.
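As a rough illustration of the quantity being minimized, the sketch below computes the largest pairwise slope in a data set. Any function that interpolates the data exactly must have a Lipschitz constant at least this large, so it is a lower bound on what a Lipschitz-minimal interpolant can achieve. This is not the paper's neural network method; the function name `lipschitz_lower_bound`, the Euclidean norm, and the sample data are assumptions made only for illustration.

```python
import math

def lipschitz_lower_bound(xs, ys):
    """Largest pairwise slope ||y_i - y_j|| / ||x_i - x_j|| over the data set.

    Assumes Euclidean norms on inputs and outputs (an illustrative choice).
    """
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dist = math.dist(xs[i], xs[j])
            if dist == 0.0:
                continue  # coincident inputs give no finite slope; skipped here
            best = max(best, math.dist(ys[i], ys[j]) / dist)
    return best

# Hypothetical noise-free samples of g(x) = 2*x1 - x2 at the corners of the unit square.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [(2.0 * a - b,) for a, b in xs]

bound = lipschitz_lower_bound(xs, ys)
```

For this sample, the bound cannot exceed the true Lipschitz constant of `g`, which is `sqrt(5)` in the Euclidean norm. With noisy data and approximate interpolation, as considered in the note, the pairwise slopes no longer give a hard lower bound in this simple form.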

Paper Structure (13 sections, 7 theorems, 61 equations, 3 figures, 3 tables)

Introduction
Theoretical Setup and Preliminaries
Preliminary Definitions
Initial problem statement
Lipschitz-Minimal Interpolation
Optimization over a subset of Lipschitz functions
The problem under random sampling
Discussion and extension to general measures over $\mathcal{D}$
Illustrative Application: Stable Vector Field Estimation
Neural Network Implementation
Numerical Example
Conclusions
Alternate proof for Theorem \ref{thm:ThrGuarantee2}(i)

Key Result

Lemma 1

Given $g\in L(\mathcal{D})$, let $\mathbb{D}_{N}^{\overline\varepsilon}$ be a noisy dataset with noise bound $\overline\varepsilon$. Furthermore, for any $\varepsilon>0$ let $f^*$ be any function in $L(\mathcal{D})$ that satisfies $\ell(\mathbb{D}_{N}^{\overline\varepsilon},f^*)\leq \varepsilon$, wi

Figures (3)

Figure 1: Trajectory errors of different models over 500 test data samples.
Figure 2: A trajectory sample of learnt models.
Figure 3: The output channel $f_i$ of the learnt models with fixed $x_2$ over the region $(x_1, x_3) \in [-2,2]\times[-2,2]$.

Theorems & Definitions (16)

Lemma 1
proof
Definition 1: Fill Distance/Covering Radius [reznikov2016covering]
Corollary 1
proof
Theorem 1
proof
Theorem 2
proof
Corollary 2
...and 6 more
