Bounds on Deep Neural Network Partial Derivatives with Respect to Parameters

Omkar Sudhir Patil, Brandon C. Fallin, Cristian F. Nino, Rebecca G. Hart, Warren E. Dixon

TL;DR

This work addresses the need for explicit, computable bounds on the parameter derivatives of deep neural networks, which are required for Lyapunov-based stability guarantees in real-time control. It develops rigorous polynomial bounds on the first and second partial derivatives of fully-connected DNNs with respect to the parameter vector $\theta$, accommodating sigmoidal and ReLU-like activations and providing closed-form, computable expressions. The authors introduce structured bounds for layer outputs, Jacobians, and Hessians, including auxiliary quantities $\mathcal{Q}_j$, $\mathcal{R}_{w,q,j}$, and $\mathcal{T}_{w,j}$, and derive bounds on the mixed second derivatives and on the overall Jacobian $\partial \Phi / \partial \theta$. They further bound the higher-order Taylor remainder $R(\sigma, \tilde{\theta})$ by a polynomial function $\rho_0(\Vert \sigma \Vert)$ times $\Vert \tilde{\theta} \Vert^2$, enabling precise convergence and stability analyses for gradient-based learning in safety-critical control systems.
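
For orientation, the remainder bound summarized above can be written against a standard first-order Taylor expansion in the parameters. The symbols $\hat{\theta}$ (a parameter estimate) and the convention $\tilde{\theta} = \theta - \hat{\theta}$ are assumptions introduced here for illustration; only $R(\sigma, \tilde{\theta})$, $\rho_0$, and the quadratic form of the bound come from the summary itself.

$$
\Phi(\sigma, \theta) = \Phi(\sigma, \hat{\theta}) + \frac{\partial \Phi}{\partial \hat{\theta}}(\sigma, \hat{\theta})\,\tilde{\theta} + R(\sigma, \tilde{\theta}),
\qquad
\Vert R(\sigma, \tilde{\theta}) \Vert \leq \rho_0(\Vert \sigma \Vert)\,\Vert \tilde{\theta} \Vert^2 .
$$

Read this way, the first-derivative bound controls the linear term and the second-derivative bound controls $R$, which is why both are needed in a Lyapunov-based analysis.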

Abstract

Deep neural networks (DNNs) have emerged as a powerful tool, with a growing body of literature exploring Lyapunov-based approaches for real-time system identification and control. These methods depend on establishing bounds for the second partial derivatives of DNNs with respect to their parameters, a requirement often assumed but rarely addressed explicitly. This paper provides rigorous mathematical formulations of polynomial bounds on both the first and second partial derivatives of DNNs with respect to their parameters. We present lemmas that characterize these bounds for fully-connected DNNs, while accommodating various classes of activation functions, including sigmoidal and ReLU-like functions. Our analysis yields closed-form expressions that enable precise stability guarantees for Lyapunov-based deep neural networks (Lb-DNNs). Furthermore, we extend our results to bound the higher-order terms in first-order Taylor approximations of DNNs, providing important tools for convergence analysis in gradient-based learning algorithms. The resulting theoretical framework provides explicit, computable expressions for previously assumed bounds, thereby strengthening the mathematical foundation of neural network applications in safety-critical control systems.
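
To make the idea of a computable bound on the higher-order Taylor terms concrete, the following is a minimal numerical sketch on a toy network. It is not the paper's construction: the network $\Phi(x,\theta)=\sum_i v_i \tanh(w_i x)$, the function names, the test values, and the hand-derived bound `rho0` are all assumptions chosen for illustration. The bound comes from a Frobenius-norm estimate of the toy network's Hessian with respect to its parameters and grows polynomially in $|x|$ once the output weights are bounded.

```python
import math

def phi(x, w, v):
    """Toy one-hidden-layer network: Phi = sum_i v_i * tanh(w_i * x)."""
    return sum(vi * math.tanh(wi * x) for wi, vi in zip(w, v))

def grad_phi(x, w, v):
    """Analytic gradient of Phi with respect to theta = (w, v)."""
    dw = [vi * x * (1.0 - math.tanh(wi * x) ** 2) for wi, vi in zip(w, v)]
    dv = [math.tanh(wi * x) for wi in w]
    return dw + dv

def remainder(x, w_hat, v_hat, dw, dv):
    """First-order Taylor remainder about (w_hat, v_hat) for the step (dw, dv)."""
    w = [a + b for a, b in zip(w_hat, dw)]
    v = [a + b for a, b in zip(v_hat, dv)]
    g = grad_phi(x, w_hat, v_hat)
    linear = sum(gi * di for gi, di in zip(g, dw + dv))
    return phi(x, w, v) - phi(x, w_hat, v_hat) - linear

def rho0(x, v_bar, n):
    """Hand-derived bound (assumption, toy network only).

    Nonzero Hessian entries: |d2Phi/dw_i^2| <= c * v_bar * x^2 and
    |d2Phi/(dw_i dv_i)| <= |x|, where c = max |tanh''| = 4 / (3*sqrt(3)).
    Returns half of the resulting Frobenius-norm bound, so that
    |R| <= rho0 * ||theta_tilde||^2 whenever |v_i| <= v_bar on the segment.
    """
    c = 4.0 / (3.0 * math.sqrt(3.0))
    return 0.5 * math.sqrt(n * (c * v_bar * x * x) ** 2 + 2.0 * n * x * x)

# Illustrative values (assumptions).
x = 1.5
w_hat = [0.3, -0.7, 1.1]
v_hat = [0.5, -0.4, 0.9]
direction = [0.2, -0.5, 0.3, 0.6, 0.1, -0.5]
norm = math.sqrt(sum(d * d for d in direction))
unit = [d / norm for d in direction]

results = []
for s in [1.0, 0.5, 0.25, 0.125]:
    dw = [s * u for u in unit[:3]]
    dv = [s * u for u in unit[3:]]
    r = remainder(x, w_hat, v_hat, dw, dv)
    # |v_i| along the segment is at most max|v_hat_i| + s for a unit direction.
    v_bar = max(abs(vi) for vi in v_hat) + s
    bound = rho0(x, v_bar, len(w_hat)) * s ** 2
    results.append((s, abs(r), bound))
    print(f"step={s:<6} |R|={abs(r):.3e}  bound={bound:.3e}")
```

Each printed row compares the measured remainder with the quadratic bound for a parameter error of norm `step`. The Frobenius estimate is deliberately conservative; the paper's lemmas target general fully-connected architectures and are expected to differ in form.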

Paper Structure

This paper contains 4 sections, 4 theorems, 29 equations.

Key Result

Lemma 1

For the DNN architecture described in (eq:phij_dnn), the output of the $j^{th}$ layer of the DNN, $\Phi_{j}$, is bounded as for all $j\in\left\{ 0,\ldots,k\right\}$, and the corresponding activation $\phi_{j}\left(\Phi_{j-1}\right)$ is bounded as for all $j\in\left\{ 1,\ldots,k\right\}$. Furthermore, if the bound $\left\Vert V_{j}\right\Vert \leq\bar{\theta}$ is applied for all $j\in\left\{ 0,\l

Theorems & Definitions (9)

  • Remark 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Theorem 1
  • proof