
Exact representation and efficient approximations of linear model predictive control laws via HardTanh type deep neural networks

Daniela Lupu, Ion Necoara

TL;DR

It is shown that neural networks with HardTanh activation functions can exactly represent predictive control laws of linear time-invariant systems, and theoretical bounds on the minimum number of hidden layers and neurons that a HardTanh neural network should have to exactly represent a given predictive control law are derived.

Abstract

Deep neural networks have revolutionized many fields, including image processing, inverse problems, and text mining, and, more recently, have given very promising results in systems and control. Neural networks with hidden layers have strong potential as an approximation framework for predictive control laws, as they usually yield better approximation quality and smaller memory requirements than existing explicit (multi-parametric) approaches. In this paper, we first show that neural networks with HardTanh activation functions can exactly represent predictive control laws of linear time-invariant systems. We derive theoretical bounds on the minimum number of hidden layers and neurons that a HardTanh neural network should have to exactly represent a given predictive control law. HardTanh deep neural networks are particularly suited to linear predictive control laws, as they usually require fewer hidden layers and neurons than deep neural networks with ReLU units to exactly represent continuous piecewise affine (or, equivalently, min-max) maps. In the second part of the paper, we bring the physics of the model and standard optimization techniques into the architecture design, in order to eliminate the disadvantages of black-box HardTanh learning. More specifically, we design trainable unfolded HardTanh deep architectures for learning linear predictive control laws based on two standard iterative optimization algorithms, namely projected gradient descent and accelerated projected gradient descent. We also study the performance of the proposed HardTanh type deep neural networks on a linear model predictive control application.
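To make the unfolding idea concrete, the following is a minimal sketch, not the authors' architecture. It assumes the predictive control problem has been condensed into a box-constrained quadratic program, min over u of 0.5 u^T H u + (F x)^T u subject to -1 <= u <= 1, where the matrices `H` and `F`, the normalized bounds, and the function names are hypothetical choices made for illustration. Under these assumptions, one projected gradient step is an affine map followed by HardTanh, so a fixed number of iterations reads as a HardTanh network with shared weights. In a trainable variant the per-layer weights would be learned rather than fixed.

```python
import numpy as np

def hardtanh(z, lo=-1.0, hi=1.0):
    # HardTanh is the Euclidean projection onto the box [lo, hi]^n.
    return np.minimum(np.maximum(z, lo), hi)

def unfolded_pgd(x, H, F, n_layers=50):
    """Run n_layers projected gradient steps written as HardTanh layers.

    Assumed problem (illustrative): min_u 0.5 u^T H u + (F x)^T u, -1 <= u <= 1,
    with H symmetric positive definite.
    """
    alpha = 1.0 / np.linalg.eigvalsh(H).max()   # step size 1/L
    W = np.eye(H.shape[0]) - alpha * H          # layer weight matrix
    b = -alpha * (F @ x)                        # layer bias, depends on the state x
    u = np.zeros(H.shape[0])
    for _ in range(n_layers):
        u = hardtanh(W @ u + b)                 # one layer = one PGD iteration
    return u

# Hypothetical problem data, chosen only for this example.
H = np.array([[2.0, 0.5], [0.5, 1.0]])
F = np.eye(2)
x = np.array([4.0, -0.2])
u = unfolded_pgd(x, H, F)
print(u)
```

The accelerated variant mentioned in the abstract would add a momentum (extrapolation) term between layers; it is omitted here to keep the sketch short.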

Paper Structure (10 sections, 8 theorems, 57 equations, 2 figures, 2 tables)

Key Result

Lemma 1

If $\mathcal{P}_{WA}: \mathbb{X} \rightarrow \mathbb{R}$ is a piecewise affine function with affine selection functions $\mathcal{A}_1(x)=c_1^T x+d_1, \ldots, \mathcal{A}_m(x)=c_m^T x+d_m$, then there exists a finite number of index sets $\mathcal{C}_1, \ldots, \mathcal{C}_l \subseteq\{1, \ldots, m\}$ such t
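
As a small illustration of the min-max viewpoint mentioned in the abstract (this is not the statement of Lemma 1, which is truncated above), the scalar HardTanh function is itself a continuous piecewise affine map with three affine pieces. The selection functions and index sets below are labels chosen for this example only:

$$
\operatorname{HardTanh}(x) = \min\bigl(\max(-1,\, x),\, 1\bigr) = \min_{i \in \{1,2\}} \; \max_{j \in \mathcal{C}_i} \mathcal{A}_j(x),
\qquad \mathcal{A}_1(x) = -1,\; \mathcal{A}_2(x) = x,\; \mathcal{A}_3(x) = 1,\; \mathcal{C}_1 = \{1, 2\},\; \mathcal{C}_2 = \{3\}.
$$

For $x < -1$ the inner maximum returns $-1$, for $-1 \le x \le 1$ it returns $x$, and for $x > 1$ the outer minimum clips the value to $1$.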

Figures (2)

  • Figure 1: Systems of 2 and 3 oscillating masses connected to each other through pairs of spring-damper blocks, and to walls (dark blocks on the sides).
  • Figure 2: Closed-loop oscillating 2 masses system (a): comparison between explicit MPC (MPT), APGD and learned MPC laws using HTNN, U-HTNN, S-U-HTNN and SS-U-HTNN networks - inputs (top) and states (bottom) trajectories for the initial state $x_0 = [4,\, 10,\, -1,\, -1]^T$.

Theorems & Definitions (9)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Theorem 1
  • Theorem 2
  • Remark 1
  • Theorem 3