Diffusion-Based Generation of Neural Activity from Disentangled Latent Codes

Jonathan D. McCart, Andrew R. Sedler, Christopher Versteeg, Domenick Mifsud, Mattia Rigotti-Thompson, Chethan Pandarinath

TL;DR

GNOCCHI is a diffusion-based conditional generative model that learns disentangled latent codes $\mathbf{c} \in \mathbb{R}^L$ from time-series neural activity, using an auxiliary encoder to infer $\mathbf{c}$ and a denoising network conditioned on it. The forward process is $\tilde{\mathbf{x}}_i = \sqrt{\bar{\alpha}_i}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_i}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, and training combines score, reconstruction, and MMD losses to promote high-information conditioning. Compared with LFADS on both synthetic motor-task data and real monkey M1 recordings, GNOCCHI learns more structured and disentangled latent spaces and accurately generates samples for unseen behavioral conditions via latent navigation. Linear traversal of its latent axes produces controlled changes in one behavioral variable while preserving the others, and GNOCCHI outperforms LFADS in disentanglement and held-out generalization. The approach advances unsupervised discovery of interpretable neural representations and holds promise for data augmentation and brain-machine interface applications; future work could extend it to jointly generating neural activity and behavior and to continuous-time, unconstrained data.
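The forward process stated above can be illustrated with a minimal NumPy sketch. The linear $\beta$ schedule, the number of steps, and the array shapes are assumptions made for illustration; the source only specifies the closed-form noising equation, not the schedule GNOCCHI uses.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) for a linear beta schedule.

    The linear schedule and its endpoints are assumptions, not taken
    from the paper.
    """
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, i, alpha_bar, rng):
    """Sample x_tilde_i = sqrt(alpha_bar_i) * x0 + sqrt(1 - alpha_bar_i) * eps."""
    eps = rng.standard_normal(x0.shape)  # eps ~ N(0, I)
    x_noisy = np.sqrt(alpha_bar[i]) * x0 + np.sqrt(1.0 - alpha_bar[i]) * eps
    return x_noisy, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(num_steps=1000)

# Hypothetical trial: 50 time bins x 8 channels of (smoothed) activity.
x0 = rng.standard_normal((50, 8))

x_early, _ = forward_noise(x0, 0, alpha_bar, rng)    # almost unchanged
x_late, _ = forward_noise(x0, 999, alpha_bar, rng)   # almost pure noise
```

Because $\bar{\alpha}_i$ shrinks toward zero as $i$ grows, early steps stay close to $\mathbf{x}_0$ while late steps approach Gaussian white noise, which is the starting point for generation.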

Abstract

Recent advances in recording technology have allowed neuroscientists to monitor activity from thousands of neurons simultaneously. Latent variable models are increasingly valuable for distilling these recordings into compact and interpretable representations. Here we propose a new approach to neural data analysis that leverages advances in conditional generative modeling to enable the unsupervised inference of disentangled behavioral variables from recorded neural activity. Our approach builds on InfoDiffusion, which augments diffusion models with a set of latent variables that capture important factors of variation in the data. We apply our model, called Generating Neural Observations Conditioned on Codes with High Information (GNOCCHI), to time series neural data and test its application to synthetic and biological recordings of neural activity during reaching. In comparison to a VAE-based sequential autoencoder, GNOCCHI learns higher-quality latent spaces that are more clearly structured and more disentangled with respect to key behavioral variables. These properties enable accurate generation of novel samples (unseen behavioral conditions) through simple linear traversal of the latent spaces produced by GNOCCHI. Our work demonstrates the potential of unsupervised, information-based models for the discovery of interpretable latent spaces from neural data, enabling researchers to generate high-quality samples from unseen conditions.
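As a rough illustration of "simple linear traversal" of a code space, the sketch below moves a code along a single direction in fixed steps. How GNOCCHI actually defines its traversal directions is not stated here; estimating the direction as a difference of condition means, and the names `condition_direction` and `traverse`, are assumptions for illustration only.

```python
import numpy as np

def condition_direction(codes_a, codes_b):
    """Direction from the mean code of condition A to that of condition B.

    Assumed estimator; the paper may define directions differently.
    """
    return codes_b.mean(axis=0) - codes_a.mean(axis=0)

def traverse(code, direction, steps):
    """Return one code per step: code + step * direction."""
    steps = np.asarray(steps, dtype=float)
    return code[None, :] + steps[:, None] * direction[None, :]

rng = np.random.default_rng(0)

# Hypothetical 4-dimensional codes for two target conditions.
codes_left = rng.normal(loc=[-1.0, 0.0, 0.0, 0.0], scale=0.05, size=(20, 4))
codes_right = rng.normal(loc=[1.0, 0.0, 0.0, 0.0], scale=0.05, size=(20, 4))

direction = condition_direction(codes_left, codes_right)
start = codes_left.mean(axis=0)

# Steps above 1.0 extrapolate beyond the second condition.
path = traverse(start, direction, steps=[0.0, 0.5, 1.0, 1.5])
```

Each row of `path` would then serve as the conditioning code for the generative model; that generation step is omitted from this sketch.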

Paper Structure (28 sections, 6 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: GNOCCHI model overview. A) Codes are inferred from neural activity and used for conditional generation. Using the noise predicted by the denoiser, the neural activity is reconstructed and penalized against the original neural activity (Mean Squared Error). During inference, Gaussian white noise is iteratively transformed into a sample using the inferred code and the denoiser. B) A schematic of how a movement in code space corresponds to a change in target position. C) Low-dimensional visualizations of the generated neural activity during conditional (left) and unconditional (right) generation.
  • Figure 2: Validating GNOCCHI on realistic synthetic neural activity. A) RNN task training overview. We trained an RNN to control a biomechanical effector to manipulate the endpoint to acquire targets on a grid. The RNN received task inputs indicating the target location and the go cue time, as well as sensory feedback inputs related to the effector endpoint and to muscle lengths and velocities. The RNN produced activations of a set of muscles to control the arm based on these inputs. B) Visualizing neural responses colored by the relative angle of the reach. Both generative models produced unit responses that closely resemble both the timecourse and behavioral structure of the ground truth data. C) Top 3 principal components of the codes. Codes exhibited organization according to several behavioral variables; here they are colored by the target x and y locations of individual trials. D) Code signal-to-noise ratio. The codes learned by GNOCCHI had a substantially higher SNR for target position than those learned by LFADS, indicating a closer relationship to behavior. E) The $R^2$ between the ground truth and generated unit activity shows that GNOCCHI-generated activity matched the ground truth with accuracy comparable to LFADS.
  • Figure 3: Controllable generation of neural activity for novel behavioral conditions. A) Target-tuned subspaces. Each model contained code subspaces with a close correspondence to target location. Codes for the held-in conditions are colored by the target x location, and the held-out codes are black. Code organization matched the target grid for seen and unseen conditions for both models. B, C) Behavior decoded from trials produced by latent navigation. When we generate new samples by navigating along the target x (left) and target y (right) dimensions of code space, the trajectories produced by GNOCCHI vary in the direction of interest and largely isolate the intended variable (endpoint location), while for LFADS there is coupling with other variables such as the start location of the reach. D) Unintended movement and the orthogonality of behavioral dimensions in code space. We quantified unwanted variation in the decoded behavior as the total absolute deviation of each of the variables that should remain fixed during latent navigation of a single variable (green). Consistent with the decoding visualization (B and C), unwanted movements produced by LFADS were substantially higher than those produced by GNOCCHI. Additionally, we computed the normalized dot product between the vectors that define each behavioral direction during latent navigation (blue), and found that the directions in the code space of LFADS are less orthogonal (more strongly coupled) than for GNOCCHI. E, F) Held-out predicted target position from inferred codes. Passing the held-out trials through each trained model, we find that both GNOCCHI and LFADS are capable of representing trials with structure that generalizes well to the task.
  • Figure 4: GNOCCHI generates realistic neural activity and captures held-out conditions in a biological dataset. A) Task schematic. Recordings from primary motor cortex of a monkey (top) while performing a random-target reaching task (bottom). B) Single neuron responses from two example neurons (columns) in the window [-200, 500] aligned to movement onset, colored by reach direction. Top: Smoothed single trial activity. Middle, Bottom: Predicted single trial activity from LFADS/GNOCCHI, respectively. C) Visualization of top 3 PCs of the code space for GNOCCHI (left) and LFADS (right), color coded by Target X position. D) $R^2$ for predicted activity of individual neurons across validation trials for LFADS (x-axis) and GNOCCHI (y-axis). Points above the unity line denote neurons for which GNOCCHI predicted activity that more closely resembled the smoothed firing rates. E) Diagram of held-out generalization experiment. Held-in (held-out) trial target locations indicated with filled (unfilled) circles. Predicted target location from inferred codes indicated by colored $\times$s (LFADS: red, GNOCCHI: blue). F) Scatter plot of error between actual target location and predicted target location (from E) for GNOCCHI (x-axis) and LFADS (y-axis). Points below the unity line indicate trials where the GNOCCHI prediction was closer to the true target than LFADS.
  • Figure 5: Smoothing input to GNOCCHI with AutoLFADS vs. Gaussian kernel. We show empirically that the prediction of held-out targets from GNOCCHI codes is not affected by whether the data are smoothed using AutoLFADS or a Gaussian kernel.
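
The two quantities described in Figure 3D can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the function names are hypothetical, and measuring deviation relative to the first traversal step is an assumption, since the caption does not state the reference point.

```python
import numpy as np

def normalized_dot(u, v):
    """Normalized dot product (cosine) between two code-space directions.

    Values near 0 indicate orthogonal, weakly coupled directions.
    """
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def unintended_movement(decoded, fixed_columns):
    """Total absolute deviation of variables that should stay fixed.

    decoded: (n_steps, n_variables) behavior decoded along one traversal.
    fixed_columns: indices of the variables expected to remain constant.
    Deviation is taken relative to the first step (an assumption).
    """
    fixed = decoded[:, fixed_columns]
    return float(np.abs(fixed - fixed[0]).sum())

# Hypothetical directions for target x and target y in a 3-D code space.
dir_x = np.array([1.0, 0.0, 0.0])
dir_y = np.array([0.0, 1.0, 0.0])
coupling = normalized_dot(dir_x, dir_y)

# Hypothetical decoded behavior while navigating target x (column 0);
# columns 1 and 2 should not change.
decoded = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.1, 0.0],
                    [2.0, -0.2, 0.0]])
unwanted = unintended_movement(decoded, fixed_columns=[1, 2])
```

Under this reading, a well-disentangled code space yields both a small normalized dot product between behavioral directions and a small unintended-movement total.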