Learning with Boolean threshold functions

Veit Elser, Manish Krishan Lal

TL;DR

This work reframes neural-network learning on Boolean data as a constraint-satisfaction problem between two complementary projections, A and B, implemented via the divide-and-concur framework and solved by the RRR projection algorithm. Boolean threshold functions with margin constraints μ_m promote sparse, ±1-weight representations that correspond to simple logic gates, enabling high interpretability and potential Boolean hardware efficiency. Across multipliers, binary autoencoding, MNIST-derived tasks, logic circuits, and cellular automata, the approach achieves exact or near-exact solutions and strong generalization in regimes where gradient-based learning struggles. The results demonstrate that constraint-based learning offers a conceptually distinct, scalable path for discrete neural systems with practical implications for interpretability and efficient inference.
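As a rough illustration of how a threshold unit with ±1 weights can act as a simple logic gate, the sketch below evaluates And, Or, and Maj as Boolean threshold functions. Everything in it is an assumption made for illustration: the helper names `btf` and `truth_table`, the encoding of True as +1, the constant input fixed at +1, and the use of the unnormalized value |w·x| as a stand-in for the margin. The paper's own sign convention and margin normalization may differ.

```python
import itertools
import numpy as np

def btf(weights, x):
    """Hypothetical helper: output the sign of the weighted sum of +/-1 inputs."""
    s = int(np.dot(weights, x))
    if s == 0:
        raise ValueError("zero weighted sum: output is undefined")
    return 1 if s > 0 else -1

def truth_table(weights, n_free):
    """Map each +/-1 input tuple to (output, |w.x|).

    Weights beyond the first n_free act on constant inputs fixed at +1
    (an assumed convention, with +1 encoding True).
    """
    n_const = len(weights) - n_free
    table = {}
    for bits in itertools.product((-1, 1), repeat=n_free):
        x = np.array(bits + (1,) * n_const)
        table[bits] = (btf(weights, x), abs(int(np.dot(weights, x))))
    return table

# Two free inputs plus one constant input; the constant's weight picks the gate.
and_table = truth_table(np.array([1, 1, -1]), n_free=2)
or_table = truth_table(np.array([1, 1, 1]), n_free=2)
# Three free inputs and no constant input give a majority gate.
maj_table = truth_table(np.array([1, 1, 1]), n_free=3)

for name, table in [("And", and_table), ("Or", or_table), ("Maj", maj_table)]:
    smallest = min(m for _, m in table.values())
    print(name, {bits: out for bits, (out, _) in table.items()}, "min |w.x| =", smallest)
```

In this toy encoding every weighted sum is an odd integer, so no input lands exactly on the threshold.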

Abstract

We develop a method for training neural networks on Boolean data in which the values at all nodes are strictly $\pm 1$, and the resulting models are typically equivalent to networks whose nonzero weights are also $\pm 1$. The method replaces loss minimization with a nonconvex constraint formulation. Each node implements a Boolean threshold function (BTF), and training is expressed through a divide-and-concur decomposition into two complementary constraints: one enforces local BTF consistency between inputs, weights, and output; the other imposes architectural concurrence, equating neuron outputs with downstream inputs and enforcing weight equality across training-data instantiations of the network. The reflect-reflect-relax (RRR) projection algorithm is used to reconcile these constraints. Each BTF constraint includes a lower bound on the margin. When this bound is sufficiently large, the learned representations are provably sparse and equivalent to networks composed of simple logical gates with $\pm 1$ weights. Across a range of tasks -- including multiplier-circuit discovery, binary autoencoding, logic-network inference, and cellular automata learning -- the method achieves exact solutions or strong generalization in regimes where standard gradient-based methods struggle. These results demonstrate that projection-based constraint satisfaction provides a viable and conceptually distinct foundation for learning in discrete neural systems, with implications for interpretability and efficient inference.
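To make the projection step more concrete, here is a minimal sketch of an RRR-style iteration on a toy problem. It uses the update commonly written as x ← x + β(P_B(2P_A(x) − x) − P_A(x)); the step parameter `beta`, the function names, the stopping rule, and both constraint sets are assumptions chosen for illustration. The toy sets (sign vectors and zero-sum vectors) are stand-ins, not the BTF and concurrence constraints of the paper.

```python
import numpy as np

def rrr(project_a, project_b, x0, beta=0.5, tol=1e-9, max_iter=1000):
    """Iterate x <- x + beta * (P_B(2 P_A(x) - x) - P_A(x)).

    Returns (P_A(x), iterations) once the two projections agree to within
    tol, or (None, max_iter) if no agreement is reached.
    """
    x = np.asarray(x0, dtype=float)
    for it in range(max_iter):
        pa = project_a(x)
        pb = project_b(2.0 * pa - x)   # project the reflection of x through A
        step = pb - pa
        if np.linalg.norm(step) < tol:
            return pa, it              # pa satisfies both constraints
        x = x + beta * step
    return None, max_iter

def project_signs(x):
    """Toy constraint A (nonconvex): nearest vector with entries +/-1."""
    return np.where(x >= 0, 1.0, -1.0)

def project_zero_sum(x):
    """Toy constraint B (affine): nearest vector whose entries sum to zero."""
    return x - x.mean()

solution, iterations = rrr(project_signs, project_zero_sum, [0.3, 0.8])
print(solution, iterations)
```

The iterate `x` itself need not satisfy either constraint; the candidate solution is read off from the projection `P_A(x)` once the step vanishes.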

Paper Structure (22 sections, 3 theorems, 79 equations, 21 figures)

Key Result

Lemma 3.3

If $w$ is admissible and $\mathop{\mathrm{supp}}\nolimits(w)=m$, then

Figures (21)

  • Figure 1: Two 5-layer random logic circuits where each node either copies a value from the layer below (isolated colored lines) or applies a logical gate to them (colored groups). The gates are either 2-input And/Or (left circuit) or 3-input Maj (right circuit). The colors (red/blue) of the highlighted edges indicate whether or not negation is applied. The blue node in the left circuit means this BTF is taking input from the constant node with weight $+1$, thereby choosing to implement And.
  • Figure 2: Evolution of the training and test accuracies when training on 128, 256, 384, and 512 data items generated by the random logic networks above. Learning the random And/Or data (left plot) is easier, but 256 data items appear to be sufficient to learn both types of data, given sufficient iterations of the algorithm.
  • Figure 3: Same evolution of learning curves as in Figure 2, but for the gradient-descent method.
  • Figure 4: Distribution of network weights after training on the two types of random logic data (left: And/Or, right: Maj). The lower histograms show the weights in just the first layer of each network.
  • Figure 5: Divide-and-concur variable groupings in BTF networks. Red lines are neuron inputs ($x$), blue disks are neuron outputs ($y$), and green lines are weights ($w$). On the left, the highlighted fan-in shows all the variables in one BTF constraint. On the right, the highlighted fan-out shows the concur constraint between neuron outputs and inputs.
  • ...and 16 more figures

Theorems & Definitions (8)

  • Definition 3.1
  • Definition 3.2
  • Lemma 3.3
  • proof
  • Theorem 3.4
  • proof
  • Lemma 3.5
  • proof