Place Cells as Multi-Scale Position Embeddings: Random Walk Transition Kernels for Path Planning
Minglu Zhao, Dehong Xu, Deqian Kong, Wen-Hao Zhang, Ying Nian Wu
TL;DR
The paper models hippocampal place cells as a population of non-negative position embeddings $h(x,\tau)$ derived from the spectral decomposition of multi-step symmetric random-walk transition kernels, with $\langle h(x,\tau), h(y,\tau)\rangle = q(y|x,\tau)$. The time-scale parameter $\sqrt{\tau}$ defines a multi-scale, Euclideanized cognitive map, where emergent sparsity arises from non-negativity and orthogonality constraints, naturally explaining localized place fields. Global spatial relationships are built efficiently through matrix squaring ($P_{2\tau}=P_\tau^2$), enabling trap-free gradient-based path planning with adaptive scale selection, and theta phase is linked to the angle structure within the population embeddings. Experiments in open-field and obstacle-rich mazes demonstrate accurate reproduction of transition kernels, robust navigation across scales, and topology-driven remapping, supporting a biologically plausible, scalable framework that connects diffusion theory, cognitive maps, and neural population coding.
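The matrix-squaring relation and the inner-product identity can be checked on a toy example. The following is a minimal sketch, not the paper's implementation: it assumes a lazy symmetric random walk on a small ring (the `ring_walk` helper, the ring size, and the `stay` probability are hypothetical choices made for illustration), builds the kernels $P_1, P_2, P_4, \dots$ by repeated squaring, and uses the rows of $P_{\tau/2}$ as one simple non-negative embedding that satisfies the identity. The paper's embeddings carry additional structure, such as sparsity, that this sketch does not attempt to reproduce.

```python
import numpy as np

def ring_walk(n, stay=0.5):
    """One-step kernel of a lazy symmetric random walk on a ring of n sites.

    Hypothetical toy environment: stay put with probability `stay`,
    otherwise step left or right with equal probability.
    """
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = stay
        P[i, (i - 1) % n] += (1 - stay) / 2
        P[i, (i + 1) % n] += (1 - stay) / 2
    return P

def kernels_by_squaring(P1, levels):
    """Return [P_1, P_2, P_4, ...] using P_{2 tau} = P_tau @ P_tau."""
    kernels = [P1]
    for _ in range(levels):
        kernels.append(kernels[-1] @ kernels[-1])
    return kernels

P1 = ring_walk(32)
kernels = kernels_by_squaring(P1, levels=4)  # tau = 1, 2, 4, 8, 16

# Illustrative embedding (an assumption, not the paper's construction):
# row x of P_{tau/2} is a non-negative vector h(x, tau).
H = kernels[3]    # P_8, rows play the role of h(x, 16)
P16 = kernels[4]  # target kernel q(y | x, 16)

# Gram matrix of the embeddings: entry (x, y) is <h(x, 16), h(y, 16)>.
gram = H @ H.T
print(np.allclose(gram, P16))
```

Because the walk is symmetric, $P_\tau = P_{\tau/2} P_{\tau/2}^\top$, which is why the Gram matrix of the rows reproduces the kernel. Each squaring doubles the time scale, so reaching scale $\tau$ takes on the order of $\log_2 \tau$ matrix products rather than $\tau$.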
Abstract
The hippocampus supports spatial navigation by encoding cognitive maps through collective place cell activity. We model the place cell population as non-negative spatial embeddings derived from the spectral decomposition of multi-step random walk transition kernels. In this framework, the inner product between embeddings, or equivalently their Euclidean distance, encodes the similarity between locations in terms of their transition probability across multiple scales, forming a cognitive map of adjacency. The combination of non-negativity and inner-product structure naturally induces sparsity, providing a principled explanation for the localized firing fields of place cells without imposing explicit sparsity constraints. The temporal parameter that defines the diffusion scale also determines field size, aligning with the hippocampal dorsoventral hierarchy. Our approach constructs global representations efficiently through recursive composition of local transitions, enabling smooth, trap-free navigation and preplay-like trajectory generation. Moreover, theta phase arises intrinsically as the angular relation between embeddings, linking spatial and temporal coding within a single representational geometry.
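The link between inner product and Euclidean distance can be made explicit with a short derivation. This is a sketch that assumes only the inner-product relation $\langle h(x,\tau), h(y,\tau)\rangle = q(y|x,\tau)$ stated in the TL;DR; expanding the squared distance gives:

```latex
\begin{aligned}
\|h(x,\tau) - h(y,\tau)\|^2
  &= \langle h(x,\tau), h(x,\tau)\rangle
   + \langle h(y,\tau), h(y,\tau)\rangle
   - 2\,\langle h(x,\tau), h(y,\tau)\rangle \\
  &= q(x|x,\tau) + q(y|y,\tau) - 2\,q(y|x,\tau).
\end{aligned}
```

Under the further assumption that the return probability $q(x|x,\tau)$ is roughly the same at every location (plausible in a homogeneous open field away from boundaries, but not guaranteed near obstacles), the first two terms are constant, so the squared distance is a decreasing function of the transition probability and the two measures order pairs of locations identically.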
