A Spatiotemporal Perspective on Dynamical Computation in Neural Information Processing Systems

T. Anderson Keller; Lyle Muller; Terrence J. Sejnowski; Max Welling

TL;DR

It is shown that spatiotemporal dynamics may be a mechanism by which natural neural systems encode approximate visual, temporal, and abstract symmetries of the world as conserved quantities, thereby enabling improved generalization and long-term working memory.

Abstract

Spatiotemporal flows of neural activity, such as traveling waves, have been observed throughout the brain since the earliest recordings; yet there is still little consensus on their functional role. Recent experiments and models have linked traveling waves to visual and physical motion, but these observations have been difficult to reconcile with standard accounts of topographically organized selectivity and feedforward receptive fields. Here, we introduce a theoretical framework that formalizes and generalizes the connection between 'motion' and flowing neural dynamics in the language of equivariant neural network theory. We consider 'motion' not only in physical or visual spaces, but also in more abstract representational spaces, and we argue that recurrent traveling-wave-like dynamics are not just useful but necessary for accurate and stable processing of any signal undergoing such motion. Formally, we show that for any non-trivial recurrent neural network to process a sequence undergoing a flow transformation (such as visual motion) in a structured equivariant manner, its hidden state dynamics must actively realize a homomorphic representation of the same flow through recurrent connectivity. In this "spatiotemporal perspective on dynamical computation", traveling waves and related flows are best understood as faithful dynamic representations of stimulus flows; consequently, the natural inclination of biological systems towards such dynamics may be viewed as an innate inductive bias towards efficiency and generalization in the spatiotemporally structured dynamical world they inhabit.

Paper Structure (24 sections, 6 equations, 6 figures)

Introduction
Representation Learning in a Dynamic Environment
Flow Equivariance: Properly Representing Motion Requires Latent Flows
Spatiotemporal Neural Dynamics are Natural Latent Flows
Spatiotemporal Dynamics as an Inductive Bias of the Biological Substrate
Flow Equivariance as Memory
A Unifying Perspective
Discussion
Conclusions
Acknowledgements
Author contributions statement

Figures (6)

Figure 1: Equivariant feature extractors $\phi(\mathbf{u})$ are those that commute with a given transformation of interest and, in doing so, they preserve the structure of the space they act on. In this way, the notion of space is expanded to include the space of arbitrary transformations and one can naturally describe 'motion' in that space, just like moving in physical space. In the left figure above, we depict a convolutional feature extractor $\phi$ applying an edge detection kernel to a simple image $\mathbf{u}$. We see that regardless of whether we first move the object in the image (through the spatial translation operation $g \cdot \mathbf{u}$) and then convolve ($\phi(g \cdot \mathbf{u})$), or if we first convolve the image $\phi(\mathbf{u})$ and then translate the output, the result is unchanged, i.e. $\phi(g \cdot \mathbf{u}) = g \cdot \phi(\mathbf{u})$. This is due to the equivariance of the convolution operation with respect to the translation group. However, traditional convolutional neural networks are not by default equivariant with respect to the group of 90-degree rotations. In the right figure, we depict a rotation-equivariant convolution [cohen2016group] where the output now exists in a 'lifted' rotation space (the vertical axis), constructed by convolving the input with rotated copies of the original kernel. We see in this setting, when rotation is applied to the input, the output transformation takes the structured form of cyclically permuting the activations through the lifted space combined with the usual spatial rotation. The model thus has a structured latent space with respect to translation and rotation.
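The commutation property $\phi(g \cdot \mathbf{u}) = g \cdot \phi(\mathbf{u})$ described in the Figure 1 caption can be checked numerically. The following is a minimal sketch, not the paper's code: it assumes a 1D cyclic signal in place of an image, and the names `shift`, `conv`, `edge_kernel`, and the toy input `u` are all hypothetical. Reflection is used here as a 1D stand-in for the rotation case, which is an assumption of this sketch rather than something the caption states.

```python
def shift(signal, g):
    """Cyclically translate a 1D signal by g positions (the action g . u)."""
    n = len(signal)
    return [signal[(i - g) % n] for i in range(n)]


def conv(signal, kernel):
    """Circular cross-correlation of a 1D signal with a small kernel (phi)."""
    n = len(signal)
    return [
        sum(k * signal[(i + j) % n] for j, k in enumerate(kernel))
        for i in range(n)
    ]


# Hypothetical toy data: a bar of ones and a two-tap edge detector.
u = [0, 0, 1, 1, 1, 0, 0, 0]
edge_kernel = [-1, 1]
g = 3

# Translation: filtering then shifting equals shifting then filtering.
lhs = conv(shift(u, g), edge_kernel)   # phi(g . u)
rhs = shift(conv(u, edge_kernel), g)   # g . phi(u)
translation_equivariant = lhs == rhs

# Reflection (1D stand-in for rotation): the same filter does not commute.
flipped_then_filtered = conv(u[::-1], edge_kernel)
filtered_then_flipped = conv(u, edge_kernel)[::-1]
reflection_equivariant = flipped_then_filtered == filtered_then_flipped

print(translation_equivariant, reflection_equivariant)
```

In this toy setting the translation check holds for any shift `g`, while the reflection check fails because the edge kernel is not symmetric; this mirrors the caption's point that a plain convolution needs extra structure (such as a lifted space of transformed kernels) to be equivariant to transformations other than translation.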
Figure 2: A flow equivariant neural network will process a moving stimulus sequence the same way as it processes a static stimulus sequence, but with the same motion applied to its hidden state. To achieve this property for a specific input flow $\psi^{(\nu)}$, a corresponding flow must preemptively shift the hidden state before the input comes in (denoted by red arrows). We show (left) the hidden state of a simple RNN $(\mathbf{h}_{t+1} = \mathbf{h}_t + \phi(\mathbf{u}_t))$ processing a static image. For flow equivariance to hold, the hidden state of the network processing a moving image should be a moving version of this output. We see (middle) that a flow equivariant RNN $(\mathbf{h}_{t+1} = \psi_1 \cdot \mathbf{h}_t + \phi(\mathbf{u}_t))$ does indeed achieve this property -- the hidden state flows in unison with the input flow (denoted by the red arrows in the feature space). On the right, we see that a regular simple RNN does not satisfy this property, since the hidden state at each timestep appears to 'lag behind' the moving input, leading to a blurring of the hidden state where the input and the hidden state do not add constructively as they did for the static case.
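The two recurrences in the Figure 2 caption can be compared directly on a toy example. This is a minimal sketch under stated assumptions: the input is a 1D cyclic signal moving one position per timestep, the encoder `phi` is a hypothetical circular difference filter, and the hidden state starts at zero. The helper names (`shift`, `phi`, `run_rnn`) are illustrative and do not come from the paper.

```python
def shift(x, g):
    """Cyclic shift by g positions; stands in for the flow action psi_g."""
    n = len(x)
    return [x[(i - g) % n] for i in range(n)]


def phi(u):
    """Hypothetical translation-equivariant encoder: circular difference."""
    n = len(u)
    return [u[(i + 1) % n] - u[i] for i in range(n)]


def run_rnn(inputs, flow_step):
    """Iterate h_{t+1} = psi_{flow_step} . h_t + phi(u_t) from h_0 = 0.

    flow_step=0 gives the plain RNN h_{t+1} = h_t + phi(u_t).
    """
    h = [0] * len(inputs[0])
    for u_t in inputs:
        h = [a + b for a, b in zip(shift(h, flow_step), phi(u_t))]
    return h


u = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
T = 4
static_seq = [u for _ in range(T)]
moving_seq = [shift(u, t) for t in range(T)]   # input moves 1 step per tick

h_static = run_rnn(static_seq, flow_step=0)    # reference: static input
h_flow = run_rnn(moving_seq, flow_step=1)      # hidden state flows with input
h_plain = run_rnn(moving_seq, flow_step=0)     # hidden state does not flow

# Flow equivariance: moving-input state is the shifted static-input state.
flow_equivariant = h_flow == shift(h_static, T - 1)
plain_equivariant = h_plain == shift(h_static, T - 1)
print(flow_equivariant, plain_equivariant)
```

In this sketch the flowing hidden state keeps the sharp peaks of the static case (contributions add constructively after each shift), whereas the plain RNN smears the same contributions across neighboring positions, which is the blurring described in the caption.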
Figure 3: Illustration of the fundamental mathematical difference between standing waves and traveling waves, which are not 'spacetime-separable.' For traveling waves, the function $u(x, t)$ that represents the displacement of the field at a spatial location $x$ and an individual point in time $t$ cannot be decomposed into two independent functions of space and time; that is, there do not exist functions $a(x)$ and $b(t)$ such that $u(x, t) = a(x)b(t)$. In particular, the first-order wave equation in the right panel, $\frac{\partial u}{\partial t} = -\nu \frac{\partial u}{\partial x}$, admits the non-separable general solution $u(x, t) = f(x - \nu t)$. In the panel on the left, standing waves, the stationary counterpart of traveling waves (sometimes called oscillations), are clearly separable into spatial and temporal components: $u(x, t) = \sin(k x)\sin(\omega t)$. Importantly, generic non-trivial motion transformations ('flows') are inherently spacetime inseparable, meaning that if a neural network is to represent them consistently, its dynamics must also be spacetime inseparable.
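The separability distinction in the Figure 3 caption can be tested numerically. If $u(x, t) = a(x)b(t)$, then the grid of samples $u(x_i, t_p)$ is a rank-one matrix, so every $2 \times 2$ minor vanishes. The sketch below applies that check to a standing wave and to a sinusoidal traveling wave $f(x - \nu t)$ with $f = \sin$; the grid, the parameter values, and the helper names `sample` and `max_minor` are assumptions made for illustration.

```python
import math


def sample(u, xs, ts):
    """Evaluate u(x, t) on a grid; rows index space, columns index time."""
    return [[u(x, t) for t in ts] for x in xs]


def max_minor(m):
    """Largest absolute 2x2 minor; ~0 iff the sampled grid has rank <= 1."""
    rows, cols = len(m), len(m[0])
    best = 0.0
    for i in range(rows):
        for j in range(i + 1, rows):
            for p in range(cols):
                for q in range(p + 1, cols):
                    minor = m[i][p] * m[j][q] - m[i][q] * m[j][p]
                    best = max(best, abs(minor))
    return best


# Illustrative parameters (not from the paper); omega = k * nu.
k, nu = 2.0, 1.5
omega = k * nu
xs = [0.1 + 0.3 * i for i in range(8)]
ts = [0.1 + 0.3 * i for i in range(8)]


def standing(x, t):
    return math.sin(k * x) * math.sin(omega * t)


def traveling(x, t):
    return math.sin(k * (x - nu * t))


standing_minor = max_minor(sample(standing, xs, ts))
traveling_minor = max_minor(sample(traveling, xs, ts))
print(standing_minor, traveling_minor)
```

For the standing wave the largest minor is zero up to floating-point rounding, while for the traveling wave it is of order one, so no product $a(x)b(t)$ can reproduce the sampled traveling wave on this grid.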
Figure 4: Biological neural systems have inherent time delays that constrain control and communication. At the most fundamental level, any physical system in which time-delays are non-negligible, i.e. large relative to the characteristic timescale of the dynamics, will exhibit spatiotemporal structure [ROXIN2011323, Campbell2007, Bressloff_2012]. In the brain, this is dependent on the interplay of conduction velocity and the neural membrane time constant. Specifically, the membrane time constant, the characteristic time of the exponential decay of a neuron back to its resting state potential, is on the order of 10 ms [Eyal2016UniqueMembraneProperties], while local traveling waves have been measured to propagate at roughly 0.1 to 0.8 m/s [Muller2018]. The spatiotemporal organization of neural activity into waves then makes sense as a result of this timescale imbalance and the distances signals must travel.
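To make the timescale comparison in the Figure 4 caption concrete, one can compute how far a local wave travels during a single membrane time constant, using distance = speed × time. This is only a back-of-the-envelope sketch using the caption's figures (10 ms, 0.1 to 0.8 m/s); the constant and function names are hypothetical, and the paper itself does not state this derived length.

```python
TAU_S = 0.010                    # membrane time constant, ~10 ms (caption)
SPEEDS_M_PER_S = (0.1, 0.8)      # measured local wave speeds (caption)


def distance_mm(speed_m_per_s, time_s=TAU_S):
    """Distance in millimeters covered at a given speed in time_s seconds."""
    return speed_m_per_s * time_s * 1000.0


low_mm, high_mm = (distance_mm(v) for v in SPEEDS_M_PER_S)
print(f"{low_mm:.1f} mm to {high_mm:.1f} mm per membrane time constant")
```

Under these numbers a wave covers roughly 1 to 8 mm in one membrane time constant, so signals crossing longer distances arrive with delays that are not small relative to the neuron's own integration timescale.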
Figure 5: A visualization of input symmetry transformations and the corresponding spatiotemporal dynamics in a spatially organized equivariant feature space. By arranging features according to symmetry transformations, the input can be represented by a smooth shift of activity in the spatially organized feature space. In the simplest case of rotation (left), we see that the corresponding representation will flow as a circular wave between orientation selective cells. For more complex abstract symmetry transformations, such as the changing of the pose of a horse as it prepares for a jump, we see that activity may still flow locally and smoothly in space when organized properly. Importantly, it is the local connectivity, laid out along the cortical surface, that induces this local shift operation, leading to structured representation of the abstract symmetries. We highlight that this is a highly idealized abstraction, and that not all transformations can be laid out perfectly in a 2D grid, and that a highly intricate optimization problem must be solved to figure out the ideal 'organization' and feature basis with which inputs are represented; yet, the idea that spatial organization of selectivity is related to lateral connectivity and structured by spatiotemporal dynamics has been supported by both models [nwm] and theory [zucker].
...and 1 more figure
