A Grid Cell-Inspired Structured Vector Algebra for Cognitive Maps

Sven Krausse, Emre Neftci, Friedrich T. Sommer, Alpha Renner

TL;DR

The paper addresses how grid-cell representations can span physical and abstract spaces by proposing GC-VSA, a unified neuro-symbolic framework that merges CAN-inspired grid-cell dynamics with Vector Symbolic Architectures. It introduces a structured 3D block code, fractional power encoding (FPE) for fractional binding, and a rotation-capable encoding that together produce hexagonal receptive fields across multiple scales and orientations. The approach is validated on path integration, spatio-temporal scene representation, and analogical reasoning on family trees, demonstrating both spatial and symbolic capabilities within a single representation. The framework offers neuromorphic-friendly, interpretable mechanisms for integrated spatial and symbolic computation, with potential implications for neuroscience, robotics, and AI systems.

Abstract

The entorhinal-hippocampal formation is the mammalian brain's navigation system, encoding both physical and abstract spaces via grid cells. This system is well-studied in neuroscience, and its efficiency and versatility make it attractive for applications in robotics and machine learning. While continuous attractor networks (CANs) successfully model entorhinal grid cells for encoding physical space, integrating both continuous spatial and abstract spatial computations into a unified framework remains challenging. Here, we attempt to bridge this gap by proposing a mechanistic model for versatile information processing in the entorhinal-hippocampal formation inspired by CANs and Vector Symbolic Architectures (VSAs), a neuro-symbolic computing framework. The novel grid-cell VSA (GC-VSA) model employs a spatially structured encoding scheme with 3D neuronal modules mimicking the discrete scales and orientations of grid cell modules, reproducing their characteristic hexagonal receptive fields. In experiments, the model demonstrates versatility in spatial and abstract tasks: (1) accurate path integration for tracking locations, (2) spatio-temporal representation for querying object locations and temporal relations, and (3) symbolic reasoning using family trees as a structured test case for hierarchical relationships.
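
The path-integration result rests on a property of FPE: binding two encoded positions yields the encoding of their sum. The sketch below illustrates this with plain FHRR phasors rather than the paper's 3D block code. The dimension `D`, the uniform phase distribution, and all function names are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 1024  # number of phasor components (hypothetical size)
# One random generator phase per component and per spatial axis (assumed uniform).
phases = rng.uniform(-np.pi, np.pi, size=(D, 2))

def encode(pos):
    """FPE: raise each base phasor to the real-valued coordinates."""
    return np.exp(1j * (phases @ np.asarray(pos, dtype=float)))

def bind(a, b):
    """FHRR binding: component-wise complex multiplication."""
    return a * b

def similarity(a, b):
    """Real part of the normalized inner product (1.0 for identical vectors)."""
    return np.real(np.vdot(a, b)) / len(a)

# Binding two encodings adds the encoded positions.
p1, p2 = np.array([0.3, -1.2]), np.array([1.1, 0.4])
bound = bind(encode(p1), encode(p2))
sim_sum = similarity(bound, encode(p1 + p2))   # close to 1
sim_other = similarity(bound, encode(p1))      # much lower

# Path integration: bind the running state with each encoded displacement.
state = encode([0.0, 0.0])
true_pos = np.zeros(2)
for step in rng.normal(scale=0.2, size=(50, 2)):
    state = bind(state, encode(step))
    true_pos += step
sim_path = similarity(state, encode(true_pos))  # close to 1
```

Because the state always encodes the allocentric position, a homing vector can in principle be read out at any step by comparing the state against a codebook of encoded locations.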

Paper Structure

This paper contains 6 sections, 16 equations, 5 figures.

Figures (5)

  • Figure 1: Schematic of the proposed GC-VSA. a: Geometric analogy of the binding operation with FPE-encoded vectors: here, binding corresponds to vector addition in 2D Euclidean space. b: Visualization of how the 3D module activation is constructed from the outer sum of three cosines. Depending on the three phases along the three axes $u,v,w$, a cosine grating with the corresponding phase shift is used along each axis, as indicated by the heatmap on each of the three sides of the cube. The activity at each point in the cube is then computed as the outer sum of these three cosine gratings. c: Schematic of the encoding scheme and the binding operation. Encoding tensors consist of a set of cubes with varying orientations and scales (i.e., relative angles and vector lengths between the hexagonal (pink) and Cartesian (black) coordinate systems in panel d). Each tensor consists of $n_\theta \times n_s$ modules ($n_\theta=2, n_s=3$ is visualized here; $n_\theta=23, n_s=5$ in the simulations), and each module consists of $n \times n \times n$ neurons (here $n=3$). The activity pattern of each module is shown in panel b. This activity pattern is discretized by an equidistantly spaced grid of the $n \times n \times n$ neurons, as visualized in the middle modules for each vector. The binding operation acts as a circular convolution between modules of the same scale and orientation. This can be seen in the activity patterns of the three modules, where $V_3$ is the activity pattern resulting from a 3D circular convolution of the activity pattern of $V_1$ with the activity pattern of $V_2$. d: Mapping of 2D space with one periodic 3D module. The image shows the 2D projection of the activity of a single 3D module, as shown in panel b. The three axes of the cube ($u,v,w$, pink) in panel b are projected onto three hexagonal axes in 2D space that are $120^\circ$ apart. Each hexagon shows an identical repeated activity pattern (as it is derived from the same module). The scale and orientation of the hexagonal coordinates ($u,v,w$) with respect to the Cartesian coordinates ($x,y$) depend on the scale and rotation of the 3D module that is projected onto the 2D plane. Intuitively, to encode position $V_1$ (green), the activity pattern is shifted to have a bump at the tip of the vector. When this is done for all modules, they interfere constructively at this location and destructively at other locations (see Fig. 2c; McNaughton et al., 2006).
  • Figure 2: The generator phases that span the encoding produce different receptive fields in different VSAs. a: Generator phases determining spatial frequencies across modules (see Fig. 1 of Frady et al., 2021). Each point describes one module with a specific scale and orientation. The scale and orientation (i.e., the spatial frequency) of a module is defined by the spatial phases of the two (or three) generator phases that span the space. In the context of 2D space representation, they can, however, also be regarded as a feature of each module of the structured encoding scheme. The three red dots are a reminder that each blue dot actually corresponds to a triplet of generator phases, one for each of the three axes. b: Example basis functions or receptive fields for different modules. Each basis function shows a hexagonal grid pattern (McNaughton et al., 2006). The periodicity and rotation of the pattern are defined by each module's scale and orientation, as shown in a. c: Intuition for the cosine similarity kernel obtained by stacking and adding the periodic basis functions (McNaughton et al., 2006). The constructive interference in the center produces a peak in the form of a 2D sinc function. d: Three example gratings that describe receptive fields of FHRR phasors with FPE encoding. Only the addition of three complex neurons yields hexagonal receptive fields. e: Receptive fields resulting from the addition of three cosine gratings. Depending on the phase relationship between the three encodings, the receptive field of a neuron in the block code is one of the two periodic patterns shown. Both types of receptive fields are found in each module for different neurons.
  • Figure 3: Rotation making use of the structure of the GC-VSA encoding. a: A single 2D position is encoded into a hyperdimensional vector using FPE. The green color shows the similarity of the encoded vector with the codebook vector at each location, and the red vector shows the encoded point. b: Using the rotation operation defined in the paper's rotation equation, the vector of panel a is rotated by angle $\alpha$. The true rotated vector is shown in red, and the green color again shows the similarity of the VSA vector to the location codebook. c: Rotation decoding: by circular convolution along the rotation axis of the rotated vector with the flipped (analogous to the complex conjugate) version of the other vector, one can read out the angle between the vectors. Here, the rotation angle $\alpha$ is approximately recovered. The green dashed line shows the readout angle.
  • Figure 4: Path integration and scene representation. a: Trajectory decoding for path integration. The blue line represents the true path; the orange line shows the decoded trajectory. The green colormap indicates the cosine similarity of the encoded position vector with the codebook. Each vector state encodes the allocentric position, so the vector back to the origin (or to any other position) can be obtained at any point in time (shown here as a black arrow). b: Visualization of a spatio-temporal scene containing five objects at four points in time. The scene is encoded into a single vector, with each object encoded by spatial, temporal, and identity features. c: Decoding with the resonator network: iterative decoding of object position, time, and identity from the bundled scene vector. The color describes the similarity between the estimated vector and the codebook vectors over multiple iterations (vertical axis). Yellow indicates high similarity and blue low similarity.
  • Figure 5: Analogical reasoning on a family tree. a: Encoding scheme of the family tree, using the two base vectors $l$ and $r$ to navigate a binary tree and the permutation operation $p(\cdot)$ to account for the depth of the tree. b: The two encoded trees of family relations. Nodes represent family members, and edges represent relationships. c: Analogical reasoning: decoding the analog of 'Charles' from tree $F_A$ in tree $F_B$ using cosine similarity. d: Low-dimensional intuition of the spatial computation in the analogy task. The green arrow represents the mapping vector $M_{AB}$, applied to identify analogs in the corresponding tree.
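
The module construction and binding described in the Figure 1 caption (panels b to d) can be sketched for a single module as follows. This is a minimal illustration, not the authors' implementation: `position_to_phases`, its `scale` and `orientation` parameters, and the normalization factor are assumptions introduced here, with the three axes simply taken $120^\circ$ apart as the caption states.

```python
import numpy as np

def position_to_phases(pos, scale=1.0, orientation=0.0):
    """Hypothetical mapping: project a 2D position onto three axes 120 degrees apart."""
    angles = orientation + np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
    axes = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (3, 2)
    return scale * (axes @ np.asarray(pos, dtype=float))

def module_activity(phases, n=3):
    """Outer sum of three phase-shifted cosine gratings on an n x n x n grid."""
    theta = 2 * np.pi * np.arange(n) / n  # equidistant neuron positions
    cu, cv, cw = (np.cos(theta - p) for p in phases)
    return cu[:, None, None] + cv[None, :, None] + cw[None, None, :]

def bind(a, b):
    """3D circular convolution of two module activities, computed via FFT."""
    return np.real(np.fft.ifftn(np.fft.fftn(a) * np.fft.fftn(b)))

n = 3  # neurons per axis, as in the caption's visualization
x1, x2 = np.array([0.4, -1.0]), np.array([1.3, 0.2])
v1 = module_activity(position_to_phases(x1), n)
v2 = module_activity(position_to_phases(x2), n)

# Binding shifts the activity pattern: the result is the module activity
# for the summed position, scaled by n**3 / 2 (requires n >= 3).
v3 = bind(v1, v2) / (n**3 / 2)
matches_sum = np.allclose(v3, module_activity(position_to_phases(x1 + x2), n))
```

A full encoding would repeat this over all $n_\theta \times n_s$ modules, each with its own scale and orientation, and bind module by module.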