Property Testing of Computational Networks

Artur Czumaj, Christian Sohler

TL;DR

This work introduces a property-testing framework for weighted computational networks, notably ReLU networks, where testers access network weights via queries and aim to distinguish networks that compute functions with a given property from networks far from any such function. It develops concrete testers for simple functions (constant 0 and OR) in the one-hidden-layer setting and proves distribution-free lower bounds illustrating inherent sublinear limitations in that model. The paper then extends the framework to networks with multiple outputs and multiple hidden layers, providing both reductions to single-output testers and deep-network testers for near-constant functions, along with structural results that any network is close to either the 0-function or the OR-function under certain parameters. It also explores monotone properties, monotone generators, and the relationship to monotone DNFs, offering general tester constructions with complexity that scales polylogarithmically with generator size. Overall, the work lays a foundation for understanding how local network structure interacts with global properties, reveals limits of distribution-free testing in this setting, and outlines promising directions for testing more complex neural architectures and function classes.

Abstract

In this paper we initiate the study of property testing of weighted computational networks viewed as computational devices. Our goal is to design property testing algorithms that, given a computational network with oracle access to the weights of the network, accept (with probability at least $\frac{2}{3}$) any network that computes a certain function (or a function with a certain property) and reject (with probability at least $\frac{2}{3}$) any network that is far from computing the function (or any function with the given property). We parameterize the notion of being far and want to reject networks that are $(\varepsilon,\delta)$-far, which means that one needs to change an $\varepsilon$-fraction of the description of the network to obtain a network that computes a function that differs in at most a $\delta$-fraction of inputs from the desired function (or any function with a given property). To exemplify our framework, we present a case study involving simple neural Boolean networks with the ReLU activation function. As a highlight, we demonstrate that for such networks, any near-constant function is testable with query complexity independent of the network's size. We also show that a similar result cannot be achieved in a natural generalization of the distribution-free model to our setting, nor in a related vanilla testing model.
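
The $\delta$ part of the distance notion above compares two Boolean functions by the fraction of inputs on which they differ. The following is a minimal sketch of that quantity only, computed by exhaustive enumeration for small $n$; it does not model the $\varepsilon$-fraction of weight changes. The helper names (`disagreement_fraction`, `is_delta_close`, `zero_fn`, `or_fn`) are hypothetical and do not come from the paper.

```python
from itertools import product


def disagreement_fraction(f, g, n):
    """Fraction of inputs x in {0,1}^n on which f(x) != g(x).

    Exhaustive over all 2^n inputs, so only practical for small n.
    """
    inputs = list(product((0, 1), repeat=n))
    return sum(f(x) != g(x) for x in inputs) / len(inputs)


def is_delta_close(f, g, n, delta):
    """True if f differs from g on at most a delta-fraction of inputs."""
    return disagreement_fraction(f, g, n) <= delta


# Two target functions mentioned in the summary: constant 0 and OR.
def zero_fn(x):
    return 0


def or_fn(x):
    return int(any(x))


if __name__ == "__main__":
    # OR and constant 0 disagree on every input except the all-zeros one.
    print(disagreement_fraction(or_fn, zero_fn, 3))
```

For $n = 3$, OR and the constant 0 function disagree on 7 of the 8 inputs, so the printed fraction is 0.875.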

Paper Structure

This paper contains 88 sections, 56 theorems, 130 equations, 4 figures, 8 algorithms.

Key Result

Theorem 1

Let $(A,w)$ be a ReLU network with $n$ input nodes, $m$ hidden layer nodes, and a single output. Let $\delta \ge e^{-n/16}$, $\frac{1}{m} < \varepsilon < \frac{1}{2}$, and $0 < \lambda < \frac{1}{2}$.

Figures (4)

  • Figure 1: A ReLU network with $n$ input nodes with inputs $x_1, \dots, x_n$, a single hidden layer with $m$ hidden layer nodes with values $y_1, \dots, y_m$, and one output node. Every input node $x_i \in \{0,1\}$ is connected to every hidden layer node $y_j$ by an edge with real weight $a_{ij} \in [-1,1]$, and every hidden layer node $y_j$ is connected to the output node by an edge of real weight $w_j \in [-1,1]$. The value of node $y_j$ is $\sum_{i=1}^n a_{ij} x_i$, which after applying the ReLU activation function gives $\mathrm{ReLU}(\sum_{i=1}^n a_{ij} x_i)$. The Boolean function computed by the network is equal to $\mathrm{sgn}\left(\mathrm{ReLU}\left(\sum_{j=1}^m \left(w_j \cdot \mathrm{ReLU}(\sum_{i=1}^n a_{ij} x_i)\right)\right)\right) \equiv \mathrm{sgn}\left(\mathrm{ReLU}(w^T \cdot \mathrm{ReLU}(Ax))\right)$.
  • Figure 2: Construction of the ReLU network $(A_L,w_L)$ used in the corollary on hardness in the vanilla testing model.
  • Figure 3: Construction of the networks used by the distributions $\mathcal{N}_1$ and $\mathcal{N}_2$. Notice that the $\frac{n}{2}$ nodes in $P$ have a positive contribution to the output while the $\frac{n}{2}$ nodes in $N$ have a negative contribution.
  • Figure 4: A ReLU network $(W_0, \dots, W_{\ell})$ with $\ell+2$ layers: layer zero $V_0$ containing $m_0$ input nodes, $\ell$ hidden layers $1, \dots, \ell$, with layer $1 \le k \le \ell$ containing $m_k$ nodes $V_k$, and layer $(\ell+1)$ containing $m_{\ell+1}$ output nodes $V_{\ell+1}$. The network has weighted edges connecting pairs of consecutive layers. For every $0 \le k \le \ell$ and for every $1 \le i \le m_k$ and $1 \le j \le m_{k+1}$, there is an edge with weight $W_k[j,i] \in [-1,1]$ connecting the $i$-th node at layer $k$ with the $j$-th node at layer $k+1$. For a given input $x = (x_1, \dots, x_{m_0})^T \in \{0,1\}^{m_0}$, for any node $j$ at layer $0$, the value of that node is $\mathrm{val}^0_j(x) = x_j$ (which is also the $j$-th row of $f_0(x)$). For any node $j$ at layer $1 \le k \le \ell+1$, the value of that node is $\mathrm{val}^k_j(x) = \sum_{i=1}^{m_{k-1}} W_{k-1}[j,i] \cdot \mathrm{ReLU}(\mathrm{val}^{k-1}_i(x))$; notice that this is the $j$-th row of $f_k(x)$. For any node $j$ at layer $\ell+1$ (i.e., for the $j$-th output node), the binary output computed by that node equals $\mathrm{output}_j(x) = \mathrm{sgn}(\mathrm{ReLU}(\mathrm{val}^{\ell+1}_j(x)))$; notice that this is the $j$-th row of $f(x)$.
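
The evaluation rules in the captions of Figures 1 and 4 can be sketched in a few lines of Python. This is an illustrative sketch and not code from the paper: the function names (`one_layer_output`, `deep_outputs`) and the nested-list weight layout are assumptions. Here `A[j][i]` stands for the weight $a_{ij}$ between input $i$ and hidden node $j$, so that the hidden values are the entries of $Ax$, and `Ws[k][j][i]` stands for $W_k[j,i]$.

```python
from itertools import product


def relu(v):
    """ReLU activation: max(v, 0)."""
    return v if v > 0 else 0.0


def sgn(v):
    """Sign of a non-negative value: 1 if positive, 0 if zero."""
    return 1 if v > 0 else 0


def one_layer_output(A, w, x):
    """Figure 1: sgn(ReLU(w^T ReLU(Ax))) for one hidden layer, one output.

    A[j][i] is the weight from input i to hidden node j (assumed layout);
    w[j] is the weight from hidden node j to the output node.
    """
    hidden = [relu(sum(a * xi for a, xi in zip(row, x))) for row in A]
    return sgn(relu(sum(wj * hj for wj, hj in zip(w, hidden))))


def deep_outputs(Ws, x):
    """Figure 4: Boolean outputs of the network (W_0, ..., W_ell).

    Ws[k][j][i] connects node i at layer k to node j at layer k+1.
    """
    vals = list(x)  # val^0_j(x) = x_j
    for W in Ws:
        # val^k_j = sum_i W_{k-1}[j,i] * ReLU(val^{k-1}_i)
        vals = [sum(wji * relu(v) for wji, v in zip(row, vals)) for row in W]
    return [sgn(relu(v)) for v in vals]


if __name__ == "__main__":
    # A toy network that computes OR on two inputs.
    A = [[1.0, 0.0], [0.0, 1.0]]
    w = [1.0, 1.0]
    for x in product((0, 1), repeat=2):
        print(x, one_layer_output(A, w, x), deep_outputs([A, [w]], x))
```

A one-hidden-layer, single-output network is the special case `Ws = [A, [w]]` of the deep evaluation, which is why both functions agree on the toy OR network above.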

Theorems & Definitions (136)

  • Definition 2.1: ReLU network with one hidden layer and single output
  • Remark 2.2
  • Definition 2.3: ReLU network being far from a property of functions
  • Remark 2.4
  • Definition 2.5
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • ...and 126 more