Emergence of Grid-like Representations by Training Recurrent Networks with Conformal Normalization

Dehong Xu; Ruiqi Gao; Wen-Hao Zhang; Xue-Xin Wei; Ying Nian Wu

TL;DR

The paper studies how hexagonal grid-like representations can emerge in RNN-based navigation models. It introduces conformal normalization, a normalization of the input velocity that makes the embedding of 2D physical space into the high-dimensional neural space conformal, with a scaling factor $s$ and directional derivative $f(\mathbf{v}, \theta)$. The authors develop both linear and nonlinear RNN formulations, analyze the associated group structure and torus topology, and demonstrate hexagonal grid patterns across modular blocks, supported by Fourier analysis and topological evidence. The work offers a principled mechanism for internal GPS-like representations that supports path integration and multi-scale grid modules without extra loss terms, with potential implications for modeling entorhinal grid cell and hippocampal place cell dynamics and for navigation in artificial agents.

Abstract

Grid cells in the entorhinal cortex of mammalian brains exhibit striking hexagon grid firing patterns in their response maps as the animal (e.g., a rat) navigates in a 2D open environment. In this paper, we study the emergence of the hexagon grid patterns of grid cells based on a general recurrent neural network (RNN) model that captures the navigation process. The responses of grid cells collectively form a high dimensional vector, representing the 2D self-position of the agent. As the agent moves, the vector is transformed by an RNN that takes the velocity of the agent as input. We propose a simple yet general conformal normalization of the input velocity of the RNN, so that the local displacement of the position vector in the high-dimensional neural space is proportional to the local displacement of the agent in the 2D physical space, regardless of the direction of the input velocity. We apply this mechanism to both a linear RNN and nonlinear RNNs. Theoretically, we provide an understanding that explains the connection between conformal normalization and the emergence of hexagon grid patterns. Empirically, we conduct extensive experiments to verify that conformal normalization is crucial for the emergence of hexagon grid patterns, across various types of RNNs. The learned patterns share similar profiles to biological grid cells, and the topological properties of the patterns also align with our theoretical understanding.
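
The normalization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `conformal_step`, the linear form `B(theta) @ v` for the directional derivative, and the parameterization `B(theta) = cos(theta) * B1 + sin(theta) * B2` with random matrices are all assumptions made for illustration. The sketch only shows that, once the directional derivative is normalized to unit length, the displacement in neural space has norm $s$ times the displacement in physical space for every heading.

```python
import numpy as np

def conformal_step(v, dx, B, s=1.0, eps=1e-12):
    """One hypothetical update of the position vector v for a 2D displacement dx.

    B is an assumed callable mapping a heading theta to a (d, d) matrix, so that
    f = B(theta) @ v plays the role of the directional derivative f(v, theta).
    """
    dr = np.linalg.norm(dx)                    # displacement length in physical space
    if dr < eps:
        return v.copy()                        # no movement, no change
    theta = np.arctan2(dx[1], dx[0])           # heading direction
    f = B(theta) @ v                           # un-normalized directional derivative
    f_bar = f / max(np.linalg.norm(f), eps)    # normalize to unit length
    return v + s * dr * f_bar                  # neural displacement has norm s * dr

# Illustrative setup: random matrices stand in for learned parameters.
rng = np.random.default_rng(0)
d = 16
B1 = rng.standard_normal((d, d))
B2 = rng.standard_normal((d, d))

def B(theta):
    # Assumed parameterization, chosen only to make the sketch runnable.
    return np.cos(theta) * B1 + np.sin(theta) * B2

v = rng.standard_normal(d)
s = 2.0
ratios = []
for theta in np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False):
    dx = 0.01 * np.array([np.cos(theta), np.sin(theta)])
    v_next = conformal_step(v, dx, B, s=s)
    ratios.append(np.linalg.norm(v_next - v) / np.linalg.norm(dx))

print(ratios)  # each entry should be close to s, independent of heading
```

Without the division by the norm of `f`, the ratio would generally vary with `theta`, which is the direction dependence the normalization is meant to remove.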

Paper Structure (30 sections, 1 theorem, 21 equations, 14 figures, 1 table)

Introduction
Contributions.
Background
Position embedding
Recurrent transformation
Place cells
Conformal normalization
Definition
Conformal 2D manifold
Why high-dimensional ${\bm{v}}$?
Modeling
Linear model
Non-linear recurrent neural networks
Multiple blocks and multi-scale coordinate systems
Learning
...and 15 more sections

Key Result

Proposition 3.3

With conformal normalization (eq:CN0) and (eq:CN1), the relation (eq:CI) holds. (eq:CI) is called conformal isometry.
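
One way to write the relation the abstract describes is given below. It is offered as a reconstruction, not as the paper's exact (eq:CI). Here $s$ is the scaling factor named in the TL;DR, and the $o(\cdot)$ remainder is an assumption about how "local" displacement is formalized.

```latex
\| {\bm{v}}({\bm{x}} + \Delta {\bm{x}}) - {\bm{v}}({\bm{x}}) \|
  = s \, \| \Delta {\bm{x}} \| + o(\| \Delta {\bm{x}} \|)
```

The right-hand side depends only on the length of $\Delta {\bm{x}}$, not on its direction, which matches the abstract's statement that the proportionality holds regardless of the direction of the input velocity.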

Figures (14)

Figure 1: (a) Recorded response maps and autocorrelation maps of four different grid cells (from moser2014network). (b) The self-position ${\bm{x}} = ({x}_1, {x}_2)$ in 2D physical space is represented by a vector ${\bm{v}}({\bm{x}})$ in the $d$-dimensional neural space. When the agent moves by $\Delta {\bm{x}}$, the vector is transformed to ${\bm{v}}({\bm{x}}+\Delta {\bm{x}}) = F({\bm{v}}({\bm{x}}), \Delta {\bm{x}})$. (c) Illustration of basis expansion model $A({\bm{x}}\mid {\bm{x}}') = \sum_{i=1}^{d} u_i({\bm{x}}') v_i({\bm{x}})$, where $v_i({\bm{x}})$ is the response map of $i$-th grid cell, shown at the bottom. $A({\bm{x}} \mid {\bm{x}}')$ is the response map of place cell associated with ${\bm{x}}'$, shown at the top. $u_i({\bm{x}}')$ is the connection weight. All the response maps are generated by our linear model.
Figure 2: (a) $(F(\cdot, \Delta {\bm{x}}), \forall \Delta {\bm{x}} \in \mathbb{R}^2)$ is a group of transformations, and this transformation group is a representation of the 2D additive Euclidean group $(\mathbb{R}^2, +)$. (b) A 2D torus embedded in 3D space (from torus). (c) Square lattice. (d) Hexagon lattice.
Figure 3: Results of linear and non-linear models with GELU and Tanh activation and path integration. (A) Hexagonal grid firing patterns emerge in learned ${\bm{v}}({\bm{x}})$ across all models with conformal normalization. (B) Autocorrelograms of the learned firing patterns. (C) Without the conformal normalization condition, the patterns are not hexagonal grid-like. (D) Gridness score distribution of all learned grid cells. The dashed line indicates the threshold for success. (E) Multi-modal distribution of grid scales of the learned grid cells, as well as the scale ratios between successive modes.
Figure 4: Results for path integration. (A) Path integration for 30 steps without re-encoding. The black line is the real trajectory and the red line is the trajectory predicted by the learned model. (B) Long-distance (100-step) path integration error over time for the non-linear model, with and without re-encoding.
Figure 5: Toroidal structure and spectral analysis of the activity of a module of grid cells. (A) Nonlinear dimensionality reduction reveals a torus-like structure in the population activity of learned grid cells. (B) Mean Fourier power spectral density, with the highlighted peaks arranged in a hexagonal pattern. (C) Projections of the reduced manifold onto three principal directional vectors, $k_1$, $k_2$, and $k_3$, which are depicted as three rings.
...and 9 more figures

Theorems & Definitions (3)

Definition 3.1
Definition 3.2
Proposition 3.3
