Vector Symbolic Finite State Machines in Attractor Neural Networks

Madison Cotteret; Hugh Greatorex; Martin Ziegler; Elisabetta Chicca

TL;DR

This work addresses the limitation of Hopfield attractor networks in performing state-dependent transitions by embedding arbitrary Finite State Machines (FSMs) within an attractor framework. States and stimuli are encoded as high-dimensional hypervectors from vector symbolic architectures, with transitions enacted by carefully constructed asymmetric terms and a masking-based input mechanism that supports asynchronous operation. The authors demonstrate both dense bipolar and sparse binary representations, showing linear and near-quadratic scaling of FSM capacity with network size, respectively, and illustrate robustness to noisy or incomplete weights and to asynchronous updates. They argue for biological plausibility and potential neuromorphic implementations, offering a distributed, learnable FSM primitive that could underpin cognitive computation in neural systems.

Abstract

Hopfield attractor networks are robust distributed models of human memory, but lack a general mechanism for effecting state-dependent attractor transitions in response to input. We propose construction rules such that an attractor network may implement an arbitrary finite state machine (FSM), where states and stimuli are represented by high-dimensional random vectors, and all state transitions are enacted by the attractor network's dynamics. Numerical simulations show the capacity of the model, in terms of the maximum size of implementable FSM, to be linear in the size of the attractor network for dense bipolar state vectors, and approximately quadratic for sparse binary state vectors. We show that the model is robust to imprecise and noisy weights, and so a prime candidate for implementation with high-density but unreliable devices. By endowing attractor networks with the ability to emulate arbitrary FSMs, we propose a plausible path by which FSMs could exist as a distributed computational primitive in biological neural networks.
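The starting point described above, random high-dimensional state vectors stored as attractors, can be sketched with a standard Hopfield network. This is only a minimal illustration under stated assumptions: the sizes `N` and `P`, the Hebbian outer-product rule, and the helper names `similarity` and `step` are choices made here, and the sketch does not include the paper's transition terms or input masking.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 1000, 10  # assumed network size and number of stored states

# Dense bipolar hypervectors, one per state (rows of X).
X = rng.choice([-1, 1], size=(P, N))

def similarity(a, b):
    """Normalised inner product: 1 for identical vectors, about 0 for random pairs."""
    return float(a @ b) / len(a)

# Standard Hebbian (outer-product) weights with no self-connections.
W = (X.T @ X) / N
np.fill_diagonal(W, 0.0)

def step(z):
    """One synchronous update of every neuron."""
    return np.where(W @ z >= 0, 1, -1)

# Start from state 0 with 20% of its entries flipped, then let the network settle.
z = X[0].copy()
flip = rng.choice(N, size=N // 5, replace=False)
z[flip] *= -1
for _ in range(5):
    z = step(z)

print("similarity to stored state:", similarity(z, X[0]))
print("similarity between two random states:", similarity(X[0], X[1]))
```

Random hypervectors of this size are nearly orthogonal, which is why the corrupted state is pulled back to the stored attractor rather than to one of the other stored states.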

Paper Structure (28 sections, 46 equations, 13 figures, 2 tables)

Introduction
Methods
Hypervector arithmetic
Hopfield networks
Finite State Machines
Attractor network construction
Constructing transitions
Edge outputs
Sparse activity states
Results
FSM emulation
Network robustness
Asynchronous updates
Storage capacity
Storage capacity with sparse states
...and 13 more sections

Figures (13)

Figure 1: An example FSM which we implement within the attractor network. Each node within the graph (e.g. "Zeus") is represented by a new hypervector $\mathbf{x}^\mu$ and stored as an attractor within the network. Every edge is labelled by its stimulus (e.g. "father_is"), for which corresponding hypervectors $\mathbf{s}_a$ and $\mathbf{s}_b$ are also generated. When a stimulus' hypervector is input to the network, it should allow all corresponding attractor transitions to take place. Each edge may also have an associated output symbol; here we choose the edges labelled "type" to output the generation of the god {"Primordial", "Titans", "Olympians"}. This graph was chosen as it displays the generality of the embedding: it contains cycles, loops, bidirectional edges and state-dependent transitions.
Figure 2: An attractor network transitioning through attractor states in a state-dependent manner, as a sequence of input stimuli is presented to the network. a) The input stimuli to the network, where for each unique stimulus (e.g. "father_is") in the FSM to be implemented (Figure 1) a pair of hypervectors $\mathbf{s}_a$ and $\mathbf{s}_b$ has been generated. No stimulus, a stimulus $\mathbf{s}_a$, then a stimulus $\mathbf{s}_b$ are input for 10 time steps each in sequence. b) & c) The similarity of the network state $\mathbf{z}_t$ to stored node attractor states $\mathbf{x} \in X_{\text{AN}}$ and stored edge states $\mathbf{e}$ respectively, computed via the inner product (Equation \ref{eqn:d_simple}). d) The similarity of the network state $\mathbf{z}_t$ to the sparse output states $\mathbf{r} \in R_{\text{AN}}$. All similarities have been labelled with the state they represent and the colours are purely illustrative. The attractor transitions shown here are explicitly state-dependent, as can be seen from the repeated input of the stimulus "father_is", which results in a transition to state "Kronos" when in "Hades", but to "Uranus" when in "Kronos". Additionally, the network is unaffected by nonsense input that does not correspond to a stored edge, as the network remains in the attractor "Uranus" when presented with the stimulus "father_is".
Figure 3: The attractor network performing a walk as in Figure 2, but using the damaged weights matrix $\mathbf{W}^{\text{noisy}}$, whose entries have been binarised and then subjected to independent additive noise, as per Equation \ref{eqn:W_construction}. a) The distribution of weights after they have been damaged in this way with noise of magnitude $\sigma_{\text{noise}} = 2$, corresponding to an SNR of 0dB. Weights whose ideal values were positive or negative have been plotted separately. b) The similarity of the network state $\mathbf{z}_t$ to stored node hypervectors, with the network using the weights from a). Shown above is the sequence of inputs given to the network, identical to that in Figure 2. c) The distribution of weights damaged with $\sigma_{\text{noise}} = 5$, corresponding to an SNR of -8dB. d) The similarity of the network state to stored node hypervectors, but with the network using the damaged weights from c). The network transitions are thus highly robust to unreliable weights, and show a gradual degradation in performance, even when the network's weights are highly imprecise and noisy. For both b) and d) the edge state and output similarity plots have been omitted for visual clarity.
Figure 4: The attractor network performing a walk as in Figure 2, but using a sparse ternary weights matrix $\mathbf{W}^{\text{sparse}} \in \{-1,0,1\}^{N \times N}$, generated via Equation \ref{eqn:W_sparse}. The weights matrices for a) and b) are 98% and 99% sparse respectively. Shown are the similarities of the network state $\mathbf{z}_t$ with stored node hypervectors $\mathbf{x} \in X_{\text{AN}}$, with the applied stimulus hypervector at any time shown above. We see that even when 98% of the entries in $\mathbf{W}$ are zeroes, the network continues to function with negligible loss in stability, as the correct walk between attractor states is performed, and the network converges on stored attractors with similarity $d(\mathbf{z}_t, \mathbf{x}) \approx 1$. At 99% sparsity there is a degradation in the accuracy of stored attractors, with the network converging on states with $d(\mathbf{z}_t, \mathbf{x}) < 1$, but with the correct walk still being performed. Beyond 99% sparsity the attractor dynamics break down (not shown). Thus although requiring a large number of neurons $N$ to enforce state pseudo-orthogonality, the network requires far fewer than $N^2$ nonzero weights to function robustly.
Figure 5: An attractor network performing a shorter walk than in Figure 2, but where neurons are updated asynchronously, with each neuron having a 10% chance of updating on any time step. a) The similarity of the network state $\mathbf{z}_t$ to stored node hypervectors, with the stimulus hypervectors being applied to the network labelled above. b) The evolution of a subset of neurons within the attractor network, where for visual clarity, three of the node hypervectors have been taken from columns of the $N$-dimensional Hadamard matrix, rather than being randomly generated. The network functions largely the same as in the synchronous case, but with transitions between attractor states now taking a finite number of time steps to complete. The model is thus not dependent on the precise timing of neuron updates, and should function robustly in asynchronous systems where timing is unreliable.
...and 8 more figures
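
The robustness settings described in the captions for Figures 3 and 5 (binarised weights with additive noise, and neurons that each update with 10% probability per time step) can be tried on a plain Hopfield memory. The sketch below is a hedged illustration only: it assumes Gaussian noise, Hebbian weights, and the sizes `N` and `P` chosen here, and it recalls a single stored state rather than walking through an FSM as the paper's network does.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 1000, 5        # assumed sizes; P is odd so no off-diagonal weight is zero
sigma_noise = 2.0     # noise magnitude, as in panel a) of Figure 3
p_update = 0.1        # per-neuron update probability, as in Figure 5

# Random bipolar hypervectors and ideal Hebbian weights.
X = rng.choice([-1, 1], size=(P, N))
W = (X.T @ X).astype(float)
np.fill_diagonal(W, 0.0)

# Damage the weights: binarise, then add independent noise (Gaussian is assumed here).
W_noisy = np.sign(W) + sigma_noise * rng.standard_normal((N, N))

# Start from stored state 0 with 20% of its entries flipped.
z = X[0].copy()
z[rng.choice(N, size=N // 5, replace=False)] *= -1

sims = []
for _ in range(200):
    sims.append(float(z @ X[0]) / N)              # similarity before this step
    target = np.where(W_noisy @ z >= 0, 1, -1)    # what each neuron would switch to
    mask = rng.random(N) < p_update               # only ~10% of neurons update
    z = np.where(mask, target, z)

final_similarity = float(z @ X[0]) / N
print("initial similarity:", sims[0])
print("final similarity:", final_similarity)
```

Because only a random subset of neurons updates on each step, the similarity rises over many time steps instead of jumping in one, which mirrors the slower transitions reported for the asynchronous case.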
