Neuromorphic Visual Scene Understanding with Resonator Networks

Alpha Renner, Lazar Supic, Andreea Danielescu, Giacomo Indiveri, Bruno A. Olshausen, Yulia Sandamirskaya, Friedrich T. Sommer, E. Paxon Frady

TL;DR

A neural network model, the hierarchical resonator, is developed to determine the generative factors of variation of objects in simple scenes, and is implemented on neuromorphic hardware using a spike-timing code for complex numbers.

Abstract

Analyzing a visual scene by inferring the configuration of a generative model is widely considered the most flexible and generalizable approach to scene understanding. Yet, one major problem is the computational challenge of the inference procedure, involving a combinatorial search across object identities and poses. Here we propose a neuromorphic solution exploiting three key concepts: (1) a computational framework based on Vector Symbolic Architectures (VSA) with complex-valued vectors; (2) the design of Hierarchical Resonator Networks (HRN) to factorize the non-commutative transforms translation and rotation in visual scenes; (3) the design of a multi-compartment spiking phasor neuron model for implementing complex-valued resonator networks on neuromorphic hardware. The VSA framework uses vector binding operations to form a generative image model in which binding acts as the equivariant operation for geometric transformations. A scene can, therefore, be described as a sum of vector products, which can then be efficiently factorized by a resonator network to infer objects and their poses. The HRN features a partitioned architecture in which vector binding is equivariant for horizontal and vertical translation within one partition and for rotation and scaling within the other partition. The spiking neuron model allows mapping the resonator network onto efficient and low-power neuromorphic hardware. Our approach is demonstrated on synthetic scenes composed of simple 2D shapes undergoing rigid geometric transformations and color changes. A companion paper demonstrates the same approach in real-world application scenarios for machine vision and robotics.
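The factorization idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes random phasor codebooks (the names `random_codebook`, `to_phasor`, and `resonator_factorize` are hypothetical), a single object, three generic factors standing in for shape and the two translation axes, and a plain asynchronous update with no spiking neurons or hierarchy. Binding is the component-wise product of unit-magnitude complex vectors, and unbinding multiplies by the complex conjugate.

```python
import numpy as np

def random_codebook(rng, n_items, dim):
    """Random phasor codebook: every entry is a unit-magnitude complex number."""
    return np.exp(1j * rng.uniform(-np.pi, np.pi, size=(n_items, dim)))

def to_phasor(v):
    """Project each component onto the unit circle (keeps phase, drops magnitude)."""
    return v / np.maximum(np.abs(v), 1e-12)

def resonator_factorize(scene, codebooks, n_iter=30):
    """Estimate one codebook index per factor from a bound scene vector."""
    # Start each estimate as the superposition of all entries in its codebook.
    estimates = [to_phasor(cb.sum(axis=0)) for cb in codebooks]
    for _ in range(n_iter):
        for f, cb in enumerate(codebooks):
            # Bind together the current estimates of all *other* factors.
            others = np.ones_like(scene)
            for g, est in enumerate(estimates):
                if g != f:
                    others = others * est
            # Unbind them from the scene (conjugate = inverse for phasors).
            unbound = scene * np.conj(others)
            # Cleanup: similarity to each codebook entry, then the
            # similarity-weighted superposition, renormalized to phasors.
            sims = cb.conj() @ unbound
            estimates[f] = to_phasor(sims @ cb)
    # Decode each factor as the codebook entry most similar to its estimate.
    return [int(np.argmax(np.abs(cb.conj() @ est)))
            for cb, est in zip(codebooks, estimates)]

# Illustrative setup (all sizes are assumptions, chosen to be well within capacity).
rng = np.random.default_rng(0)
dim = 2048
codebooks = [random_codebook(rng, 8, dim) for _ in range(3)]
true_idx = [3, 5, 1]
scene = codebooks[0][3] * codebooks[1][5] * codebooks[2][1]  # bound product
decoded = resonator_factorize(scene, codebooks)
```

With 8 entries per factor the search space has only 512 combinations, so `decoded` is expected to match `true_idx` after a few iterations. The point of the sketch is that each factor is updated while the others are held in superposition, which avoids enumerating all combinations explicitly.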

Paper Structure (22 sections, 12 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Resonator network for inferring shape, color, and translation. A. A synthetic scene and the generative VSA representation. B. A resonator module. C. Encoding and communication in the resonator network. D. Confusion matrix on translation benchmark task with a single object. The overall performance of the network is 98.4%. E. The weighted factor estimates in each resonator module. The maximum value is taken as the output. F. The four dynamic estimates in the resonator network are each visualized as a heatmap, with time represented vertically and each component represented horizontally. After several iterations, the resonator network converges to a solution and remains stable (first row corresponds to panel E). The decoded output is visualized to the right of each row. The object is then 'explained away'. The resonator network is reset and converges to another solution, which describes a different object in the scene (rows 2 and 3).
  • Figure 2: Resonator network for rotation and scale. A. Translation in log-polar space results in rotation and scaling in Cartesian space. B. Diagram of resonator network for inferring shape, rotation, and scaling of input images. C. Example of network dynamics. D. Symmetries of the template lead to ambiguous factorizations. Two examples are shown with different random initializations. The resonator network will converge to one of the ambiguous factorizations (letters 'b' or 'q').
  • Figure 3: The hierarchical resonator network for inferring rigid transforms. A. Schematic diagram of the hierarchical resonator network. B. The dynamics of the resonator network identifying objects in the input scene.
  • Figure 4: Local minima in the hierarchical resonator network. A. Input image. B, C. Two incorrect runs of the network are visualized. D. Confusion matrix of object classes. E. Correlations between the incorrect explanations and the input are on par with the correct explanations (right panel).
  • Figure 5: The resonator network on the Loihi neuromorphic hardware. A. Schematic of the resonator architecture implemented on Loihi. B. Implementation of the (un-)binding module as a 4-compartment neuron on Loihi. C. Mechanism of the phase shift. Here, the soma membrane potential is inhibited by 2, so it will reach the threshold two timesteps later. The inset at the top shows the equation of complex multiplication, its phasor representation, and the corresponding spike timing in phasor I&F neurons. D. Mechanism of the cleanup module. Top: Phasor representation of the complex matrix multiplication of h with the cleanup matrix $\mathbf{H} \mathbf{H}^\dagger$. Bottom: The same mechanism with I&F phasor neurons. E. Mechanism of the complex adder with I&F phasor neurons. The neuron receives two inputs at different phases (orange and green). The current gets integrated into the membrane potential, which approximates a sine wave (red). In blue, the membrane potential and input current of the Loihi neuron are shown. F. States of the resonator on Loihi over 40 iterations and reconstructed image from resonator states.
  • ...and 3 more figures
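
The statement in the Figure 2 caption that translation in log-polar space corresponds to rotation and scaling in Cartesian space can be checked numerically. The following is a small sketch on point coordinates rather than images; the helper names `to_log_polar`, `from_log_polar`, and `rotate_and_scale` are hypothetical, and the points are chosen away from the origin, where the logarithm of the radius is undefined.

```python
import numpy as np

def to_log_polar(xy):
    """Map Cartesian points (x, y) to (log r, theta)."""
    x, y = xy[..., 0], xy[..., 1]
    return np.stack([np.log(np.hypot(x, y)), np.arctan2(y, x)], axis=-1)

def from_log_polar(lp):
    """Map (log r, theta) back to Cartesian (x, y)."""
    r, theta = np.exp(lp[..., 0]), lp[..., 1]
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

def rotate_and_scale(xy, angle, scale):
    """Apply a rotation by `angle` and a uniform scaling directly in Cartesian space."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return scale * xy @ rot.T

# Example points (arbitrary, none at the origin).
points = np.array([[1.0, 0.0], [0.5, 2.0], [-1.5, 0.75]])
angle, scale = np.pi / 6, 1.8

# A shift by (log scale, angle) in log-polar coordinates...
shifted = to_log_polar(points) + np.array([np.log(scale), angle])
via_log_polar = from_log_polar(shifted)

# ...matches rotating and scaling the points directly.
direct = rotate_and_scale(points, angle, scale)
```

Comparing `via_log_polar` with `direct` (for example with `np.allclose`) shows the two agree, which is why a binding operation that is equivariant to translation can also handle rotation and scaling once the input is resampled in log-polar coordinates.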