Know Thyself by Knowing Others: Learning Neuron Identity from Population Context

Vinam Arora; Divyansha Lachi; Ian J. Knight; Mehdi Azabou; Blake Richards; Cole L. Hurwitz; Josh Siegle; Eva L. Dyer

TL;DR

NuCLR introduces a self-supervised, population-contextual framework to learn neuron-level representations from large-scale neural activity. By using a permutation-equivariant spatiotemporal transformer and a two-view contrastive objective, it yields stable, discriminative neuron embeddings that transfer zero-shot across sessions and animals. The approach achieves state-of-the-art zero-shot decoding of cell type and brain region on multiple datasets and demonstrates data-efficient labeling and scalable gains with more animals. Ablations confirm the critical role of population context and dropout regularization. These results suggest that large, diverse, unlabeled neural datasets can support general-purpose neuron identity estimation across modalities and subjects.

Abstract

Neurons process information in ways that depend on their cell type, connectivity, and the brain region in which they are embedded. However, inferring these factors from neural activity remains a significant challenge. To build general-purpose representations that resolve information about a neuron's identity, we introduce NuCLR, a self-supervised framework that aims to learn representations of neural activity that differentiate one neuron from the rest. NuCLR brings together views of the same neuron observed at different times and across different stimuli and uses a contrastive objective to pull these representations together. To capture population context without assuming any fixed neuron ordering, we build a spatiotemporal transformer that integrates activity in a permutation-equivariant manner. Across multiple electrophysiology and calcium imaging datasets, a linear decoding evaluation on top of NuCLR representations achieves a new state-of-the-art for both cell type and brain region decoding tasks, and demonstrates strong zero-shot generalization to unseen animals. We present the first systematic scaling analysis for neuron-level representation learning, showing that increasing the number of animals used during pretraining consistently improves downstream performance. The learned representations are also label-efficient, requiring only a small fraction of labeled samples to achieve competitive performance. These results highlight how large, diverse neural datasets enable models to recover information about neuron identity that generalizes across animals. Code is available at https://github.com/nerdslab/nuclr.
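
To make the two-view contrastive idea concrete, the sketch below computes a symmetric InfoNCE-style loss over per-neuron embeddings. The abstract does not state the exact form of the objective, so the loss, the temperature value, the function name info_nce, and the synthetic embeddings are all illustrative assumptions, not the NuCLR implementation. Row i of each view is assumed to be the same neuron observed in a different time window.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.1):
    """Symmetric InfoNCE loss (assumed form, not taken from the paper).

    view_a, view_b: (num_neurons, dim) embeddings; row i in both views
    is the same neuron, every other row acts as a negative.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (num_neurons, num_neurons)

    def cross_entropy_on_diagonal(l):
        # Numerically stable log-softmax over each row.
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # The positive pair for row i sits at column i.
        return -np.mean(np.diag(log_prob))

    # Average the a->b and b->a directions.
    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

# Synthetic stand-in for encoder outputs: a shared per-neuron "identity"
# vector plus small view-specific noise (hypothetical data).
rng = np.random.default_rng(0)
identity = rng.normal(size=(8, 16))
view_a = identity + 0.05 * rng.normal(size=identity.shape)
view_b = identity + 0.05 * rng.normal(size=identity.shape)

aligned = info_nce(view_a, view_b)         # correct neuron pairing
shuffled = info_nce(view_a, view_b[::-1])  # deliberately mismatched pairing
print(f"aligned={aligned:.4f}  shuffled={shuffled:.4f}")
```

Under these assumptions the loss is low when the two views of each neuron agree and high when the pairing is scrambled. It is also unchanged when the same permutation is applied to the neurons in both views, which is consistent with an encoder that assumes no fixed neuron ordering.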
