SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation
Yimu Zhang, Dongqi Han, Yansen Wang, Zhenning Lv, Yu Gu, Dongsheng Li
TL;DR
SimSort addresses the lack of ground truth in spike sorting by pretraining a fully automated pipeline on a large-scale, biophysically realistic simulated dataset. A transformer-based spike detector, paired with a spike-identification module trained with contrastive learning, learns robust, transferable representations that generalize to real neural recordings without fine-tuning. The approach achieves strong zero-shot performance across multiple benchmarks and real-world data, improves further with limited fine-tuning, and scales clearly as the amount of training data grows. This simulation-driven pretraining paradigm offers a scalable, plug-and-play solution for spike sorting in diverse electrophysiology settings.
Abstract
Spike sorting is an essential step in analyzing neural recordings: it identifies and separates the electrical signals of individual neurons recorded by electrodes in the brain, enabling researchers to study how specific neurons communicate and process information. Although a number of spike sorting methods have contributed to significant neuroscientific breakthroughs, many are heuristically designed, and their correctness is hard to verify because ground-truth labels are difficult to obtain from real-world neural recordings. In this work, we explore a data-driven, deep learning-based approach. We begin by creating a large-scale dataset through electrophysiology simulations using biologically realistic computational models. We then present SimSort, a pretraining framework for spike sorting. Trained solely on simulated data, SimSort demonstrates zero-shot generalizability to real-world spike sorting tasks, yielding consistent improvements over existing methods across multiple benchmarks. These results highlight the potential of simulation-driven pretraining to enhance the robustness and scalability of spike sorting in experimental neuroscience.
