Table of Contents
Fetching ...

Taming Binarized Neural Networks and Mixed-Integer Programs

Johannes Aspman, Georgios Korpas, Jakub Marecek

TL;DR

It is shown that binarized neural networks admit a tame representation by reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, which is shown to have nice properties and makes it possible to use the framework of Bolte et al. for implicit differentiation.

Abstract

There has been a great deal of recent interest in binarized neural networks, especially because of their explainability. At the same time, automatic differentiation algorithms such as backpropagation fail for binarized neural networks, which limits their applicability. By reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, we show that binarized neural networks admit a tame representation. This, in turn, makes it possible to use the framework of Bolte et al. for implicit differentiation, which offers the possibility for practical implementation of backpropagation in the context of binarized neural networks. This approach could also be used for a broader class of mixed-integer programs, beyond the training of binarized neural networks, as encountered in symbolic approaches to AI and beyond.

Taming Binarized Neural Networks and Mixed-Integer Programs

TL;DR

It is shown that binarized neural networks admit a tame representation by reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, which is shown to have nice properties and makes it possible to use the framework of Bolte et al. for implicit differentiation.

Abstract

There has been a great deal of recent interest in binarized neural networks, especially because of their explainability. At the same time, automatic differentiation algorithms such as backpropagation fail for binarized neural networks, which limits their applicability. By reformulating the problem of training binarized neural networks as a subadditive dual of a mixed-integer program, we show that binarized neural networks admit a tame representation. This, in turn, makes it possible to use the framework of Bolte et al. for implicit differentiation, which offers the possibility for practical implementation of backpropagation in the context of binarized neural networks. This approach could also be used for a broader class of mixed-integer programs, beyond the training of binarized neural networks, as encountered in symbolic approaches to AI and beyond.
Paper Structure (15 sections, 6 theorems, 15 equations, 4 figures)

This paper contains 15 sections, 6 theorems, 15 equations, 4 figures.

Key Result

Theorem 1

eq:BNN is equivalent to the following mixed-integer linear program: where $x\coloneqq x^{(0)}$, $u^{(\ell)}\coloneqq x^{(\ell)}$, for $0<\ell\leq L$, $M_1:=(n r+1)$, $\| x\|<r$ a Euclidean norm bound, $n$ the dimension of $x$, and $M_\ell:=\left(d_{\ell-1}+1\right)$. The new variables $s^{(\ell)}_{ij}\in[-1,1]$ have been added to linearize the products $w^{(\ell)}x^

Figures (4)

  • Figure 1: A BNN with $|L|=4$ layers, $L=(2,3,3,2)$. The input vector $\boldsymbol{x}=(x_1,x_2)$ take values in $\mathbb{R}^2$, while the activation functions, $\sigma$, in the following layers compress this to lie in the set $\{0,1\}^{L_\ell}$, for $\ell>0$.
  • Figure 2: An illustrative example of a mixed-integer set as subset of $\mathbb{Z}\times \mathbb{R}$. This is given as the feasible set of the MIP \ref{['mip']}.
  • Figure 3: A graphical representation of the three final layers of the BNN we use as an example. See \ref{['eq:ralphsPrimal']} for the corresponding MILP.
  • Figure 4: The value function, $f(b)$, of the example \ref{['eq:ralphsSubadditive']} together with the simple approximation, $\tilde{f}(b)$, given by \ref{['eq:approx']}.

Theorems & Definitions (14)

  • Theorem 1: MILP formulation
  • Definition 1: Regular cone
  • Definition 2: Subadditive and non-decreasing functions
  • Theorem 2: Thm. 3 of Kocuk2019
  • Definition 3: NT
  • Definition 4: WNT
  • Theorem 3: NT theorem, csiszar1964lim
  • Definition 5: o-minimal structure
  • Definition 6: Locally o-minimal structure
  • Definition 7: Definably Baire, from FORNASIERO2013211
  • ...and 4 more