RAYEN: Imposition of Hard Convex Constraints on Neural Networks

Jesus Tordesillas; Jonathan P. How; Marco Hutter

TL;DR

RAYEN enforces hard convex constraints in neural networks through an offline/online framework that maps the network's latent variable into the convex feasible set without orthogonal projection or soft penalties. Offline, it computes the affine hull of the feasible set and an interior point z0 of the set expressed in that reduced subspace. Online, it treats the latent variable as a step from z0 and uses κ, the inverse distance to the boundary along the step direction, to shorten the step so that its endpoint stays inside the set; κ is computed separately for linear, convex quadratic, SOC, and LMI constraints. The approach guarantees feasibility at test time, runs 20 to 7468 times faster than state-of-the-art algorithms on constrained optimization tasks, and keeps the cost close to the optimum. Limitations include the current restriction to convex constraints and the careful handling required for constraints that are not fixed; future work explores extensions to nonconvex constraints, additional constraint families, and applications to safe robotics control.
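As an illustration of the online step described above, the following minimal sketch handles only linear inequality constraints of the form A z <= b. It is not the authors' implementation: the function names `kappa_linear` and `rayen_step` are hypothetical, and the formula for κ is derived here directly from the description of κ as the inverse distance from z0 to the boundary along the unit direction. Substituting z = z0 + λ v̄ into each row gives λ <= (b_i - a_i·z0) / (a_i·v̄) whenever a_i·v̄ > 0, so the inverse of the largest admissible λ is the maximum of a_i·v̄ / (b_i - a_i·z0), floored at zero.

```python
import numpy as np

def kappa_linear(A, b, z0, v_bar):
    """Inverse distance from z0 to the boundary of {z : A z <= b} along v_bar.

    Assumes z0 is strictly interior (A z0 < b) and v_bar has unit norm.
    Returns 0.0 when the ray never leaves the set.
    """
    slack = b - A @ z0  # strictly positive for an interior point
    if np.any(slack <= 0):
        raise ValueError("z0 must lie strictly inside the feasible set")
    return max(0.0, float(np.max((A @ v_bar) / slack)))

def rayen_step(A, b, z0, v, eps=1e-12):
    """Move from z0 along v, shortening the step so the endpoint stays feasible."""
    norm_v = np.linalg.norm(v)
    if norm_v < eps:
        return z0.copy()  # zero step: stay at the interior point
    v_bar = v / norm_v
    kappa = kappa_linear(A, b, z0, v_bar)
    # kappa == 0 means no boundary is hit in this direction, so keep the full step.
    step = norm_v if kappa == 0.0 else min(1.0 / kappa, norm_v)
    return z0 + step * v_bar

if __name__ == "__main__":
    # Hypothetical feasible set: the box |z_1| <= 1, |z_2| <= 1, with z0 at the origin.
    A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    b = np.ones(4)
    z0 = np.zeros(2)
    print(rayen_step(A, b, z0, np.array([3.0, 0.0])))  # long step, stopped on the face z_1 = 1
    print(rayen_step(A, b, z0, np.array([0.2, 0.1])))  # short step, left unchanged
```

Because the step length is only ever shortened and never projected, the operation is a composition of elementary differentiable pieces almost everywhere, which is consistent with the statement that gradients can be backpropagated through the module.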

Abstract

This paper presents RAYEN, a framework to impose hard convex constraints on the output or latent variable of a neural network. RAYEN guarantees that, for any input or any weights of the network, the constraints are satisfied at all times. Compared to other approaches, RAYEN does not perform a computationally-expensive orthogonal projection step onto the feasible set, does not rely on soft constraints (which do not guarantee the satisfaction of the constraints at test time), does not use conservative approximations of the feasible set, and does not perform a potentially slow inner gradient descent correction to enforce the constraints. RAYEN supports any combination of linear, convex quadratic, second-order cone (SOC), and linear matrix inequality (LMI) constraints, achieving a very small computational overhead compared to unconstrained networks. For example, it is able to impose 1K quadratic constraints on a 1K-dimensional variable with an overhead of less than 8 ms, and an LMI constraint with 300x300 dense matrices on a 10K-dimensional variable in less than 12 ms. When used in neural networks that approximate the solution of constrained optimization problems, RAYEN achieves computation times between 20 and 7468 times faster than state-of-the-art algorithms, while guaranteeing the satisfaction of the constraints at all times and obtaining a cost very close to the optimal one.

Paper Structure (23 sections, 33 equations, 9 figures, 7 tables)


Introduction and Related Work
Problem Setup
RAYEN: Offline
Affine Hull of $\mathcal{Y}$
Set $\mathcal{Z}$
Interior point of $\mathcal{Z}$
RAYEN: Online
Computation of $\kappa_{L}$
Computation of $\kappa_{Q}$
Computation of $\kappa_{S}$
Computation of $\kappa_{M}$
Remarks
Results
Optimization problems
Computation time
...and 8 more sections

Figures (9)

Figure 1: RAYEN applied to a batch of 500 samples with a feasible set (①) defined by linear, convex quadratic, SOC, and LMI constraints. For each sample in the batch, RAYEN lets the corresponding latent variable of the network be the vector that defines the step to take from an interior point of the feasible set (②). The length of this vector is then adjusted to ensure that the endpoint lies within the set (③). For visualization purposes, a section of the set has been removed in the right plots.
Figure 2: Sets $\mathcal{Y}\subseteq\mathbb{R}^{k}$ and $\mathcal{Z}\subseteq\mathbb{R}^{n}$. $\boldsymbol{z}_{0}$ is a point in the interior of $\mathcal{Z}$, and any point $\boldsymbol{z}\in\mathcal{Z}$ is mapped to its corresponding point $\boldsymbol{y}\in\mathcal{Y}$ using $\boldsymbol{y}=\boldsymbol{f}(\boldsymbol{z})$. In this example, the dimension of the ambient space is $k=3$, while the dimension of $\operatorname{aff}\left(\mathcal{Y}\right)$ is $n=2$. For visualization purposes, here $\mathcal{Y}$ is defined by only linear and quadratic constraints. The values of $\kappa_{L}$ and $\kappa_{Q}$ (the inverse distances to $\partial\mathcal{Z}_{L}$ and $\partial\mathcal{Z}_{Q}$, respectively, along the direction $\bar{\boldsymbol{v}}:=\frac{\boldsymbol{v}}{\left\Vert \boldsymbol{v}\right\Vert }$; see Section \ref{sec:Online}) are also shown. In this figure, $\boldsymbol{y}_{0}:=\boldsymbol{f}(\boldsymbol{z}_{0})$ and $\boldsymbol{y}_{1}:=\boldsymbol{f}(\boldsymbol{z}_{1})$.
Figure 3: Neural network equipped with RAYEN. $\kappa$ is computed as shown in Table \ref{tab:definition_kappas}, and the map $\boldsymbol{f}\left(\boldsymbol{z}_{0}+\min\left(\frac{1}{\kappa},\left\Vert \boldsymbol{v}\right\Vert \right)\bar{\boldsymbol{v}}\right)$ is used to guarantee $\boldsymbol{y}\in\mathcal{Y}$. Here, L($m,n$) denotes a linear layer with input size $m$ and output size $n$. This linear layer is not needed if $m=n$. After the RAYEN module, there may be more downstream layers. During the training procedure, gradients can then be backpropagated through RAYEN.
Figure 4: Geometric meaning of $\kappa_{L}$, $\kappa_{Q}$, $\kappa_{S}$, $\kappa_{M}$, and $\kappa$ for a given value of $\bar{\boldsymbol{v}}$. The green segment indicates the direction of $\bar{\boldsymbol{v}}$. In this example, $n=3$, $\eta=1$ (i.e., only one quadratic constraint), and $\mu=1$ (i.e., only one SOC constraint). Here, $\boldsymbol{t}$ ($\boldsymbol{t}_L$, $\boldsymbol{t}_Q$, $\boldsymbol{t}_S$, $\boldsymbol{t}_M$) denotes the intersection between the ray that starts at $\boldsymbol{z}_0$ and follows $\bar{\boldsymbol{v}}$ with $\partial\mathcal{Z}$ ($\partial\mathcal{Z}_L$, $\partial\mathcal{Z}_Q$, $\partial\mathcal{Z}_S$, $\partial\mathcal{Z}_M$ respectively). These rays are the ones that give the name to the algorithm (RAYEN).
Figure 5: Eigenvalues of $\delta\boldsymbol{H}+\boldsymbol{S}$ as a function of $\delta$. In this example, $\boldsymbol{H}\succ\boldsymbol{0}$ and $\boldsymbol{S}$ are $3\times3$ matrices. $\kappa_M$, defined in Eq. \ref{eq:definitionkM}, is also shown here.
...and 4 more figures
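
The relationship between $\mathcal{Y}$ and $\mathcal{Z}$ in the Figure 2 caption can be illustrated with a small sketch. It assumes, for illustration only, that the affine hull is determined by explicit equality constraints A_eq y = b_eq and that the map takes the affine form f(z) = N z + y_p, with N a basis of the null space of A_eq and y_p a particular solution. The function name `affine_hull_map` is hypothetical, and the sketch does not detect inequality constraints that happen to be always active, which a full treatment of the affine hull would also need to consider.

```python
import numpy as np

def affine_hull_map(A_eq, b_eq):
    """Return (N, y_p) such that y = N @ z + y_p satisfies A_eq @ y = b_eq for all z.

    N has k rows and n = k - rank(A_eq) orthonormal columns.
    """
    # Particular (minimum-norm) solution of the equality constraints.
    y_p, *_ = np.linalg.lstsq(A_eq, b_eq, rcond=None)
    if not np.allclose(A_eq @ y_p, b_eq, atol=1e-8):
        raise ValueError("equality constraints are inconsistent")
    # The rows of Vt beyond the rank span the null space of A_eq.
    _, _, Vt = np.linalg.svd(A_eq)
    rank = np.linalg.matrix_rank(A_eq)
    N = Vt[rank:].T
    return N, y_p

if __name__ == "__main__":
    # Hypothetical example mirroring the dimensions in Figure 2: k = 3, n = 2.
    A_eq = np.array([[1.0, 1.0, 1.0]])
    b_eq = np.array([1.0])
    N, y_p = affine_hull_map(A_eq, b_eq)
    z = np.array([0.3, -0.7])          # any point in the reduced space
    y = N @ z + y_p                    # corresponding point in the ambient space
    print(N.shape, np.allclose(A_eq @ y, b_eq))
```

Working in the reduced variable z removes the equality constraints entirely, so the online step only has to deal with inequality-type constraints in n dimensions.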
