Synthetic Data for Discriminating Serotonergic Neurons using Convolutional Neural Networks
Daniele Corradetti, Alessandro Bernardi, Renato Corradetti
TL;DR
This work tackles the challenge of identifying serotonergic neurons, including atypical cells, from electrophysiological recordings where conventional visual criteria are insufficient. It introduces a synthetic data augmentation pipeline that smooths real spike waveforms and fuses them with diverse real noise masks to expand training data for CNN classifiers. A CNN ensemble trained on these synthetic samples achieves strong generalization to non-homogeneous data (e.g., accuracy ~0.94, TPR ~0.962, TNR ~0.888, AUC ~0.926), and a practical case is provided with open-source code. The approach provides a scalable, real-time capable tool for serotonergic neuron identification, with future work pointing to GAN-based augmentation and broader species/dataset applicability.
Abstract
Serotonergic neurons in the raphe nuclei exhibit diverse electrophysiological properties and functional roles, yet conventional identification methods rely on restrictive criteria that likely overlook atypical serotonergic cells. Convolutional neural networks (CNNs) are a promising approach for classifying both typical and atypical serotonergic neurons, but the limited experimental data available for training is often the key obstacle. This study presents a procedure for synthetic data generation that combines smoothed spike waveforms with heterogeneous noise masks from real recordings. This approach expanded the training set while mitigating overfitting to background noise signatures. CNN models trained on the augmented dataset achieved high accuracy (96.2% true positive rate, 88.8% true negative rate) on non-homogeneous data collected under experimental conditions different from those of the training, validation, and testing data.
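The augmentation idea described in the abstract (smoothed spike waveforms recombined with noise masks taken from real recordings) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the moving-average smoother, the definition of a noise mask as the residual between a trace and its smoothed version, the function names `smooth`, `noise_mask`, and `synthesize`, and the toy Gaussian-shaped "spikes" are all assumptions made for illustration.

```python
import numpy as np


def smooth(waveform, window=5):
    """Moving-average smoothing (an assumed stand-in for the paper's smoothing step).

    `window` must be odd so the output has the same length as the input.
    """
    kernel = np.ones(window) / window
    padded = np.pad(waveform, window // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def noise_mask(waveform, window=5):
    """Assumed noise mask: the residual left after removing the smoothed trace."""
    return waveform - smooth(waveform, window)


def synthesize(spikes, masks, n_samples, rng):
    """Pair randomly chosen smoothed spikes with randomly chosen noise masks."""
    spike_idx = rng.integers(0, len(spikes), size=n_samples)
    mask_idx = rng.integers(0, len(masks), size=n_samples)
    return np.stack(
        [smooth(spikes[i]) + masks[j] for i, j in zip(spike_idx, mask_idx)]
    )


# Toy data only: Gaussian bumps stand in for recorded spike waveforms.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
spikes = [
    np.exp(-((t - 0.5) / width) ** 2) + 0.05 * rng.standard_normal(t.size)
    for width in (0.03, 0.05)
]
masks = [noise_mask(0.1 * rng.standard_normal(t.size)) for _ in range(4)]

synthetic = synthesize(spikes, masks, n_samples=16, rng=rng)
print(synthetic.shape)  # (16, 200)
```

Because each synthetic sample draws its spike and its noise mask independently, the same waveform appears against several noise backgrounds, which is the property the abstract credits with reducing overfitting to background noise signatures.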
