Time-Evolving Dynamical System for Learning Latent Representations of Mouse Visual Neural Activity

Liwei Huang, ZhengYu Ma, Liutao Yu, Huihui Zhou, Yonghong Tian

TL;DR

This work addresses learning meaningful, time-aware latent representations from visual neural activity. It introduces TE-ViDS, a time-evolving dynamical system that disentangles stimulus-related and internal state information into external and internal latent representations, respectively, and learns them with a contrastive loss and a time-dependent prior within a sequential VAE framework. The approach yields superior decoding of natural scenes and movies in mouse visual cortex and reveals interpretable latent trajectories, while also uncovering variability across subjects and cortical regions. The findings advance understanding of visual information processing and offer a scalable tool for analyzing neural dynamics under naturalistic stimulation, with code available to reproduce the results.

Abstract

Seeking high-quality representations with latent variable models (LVMs) to reveal the intrinsic correlation between neural activity and behavior or sensory stimuli has attracted much interest. In the study of the biological visual system, naturalistic visual stimuli are inherently high-dimensional and time-dependent, leading to intricate dynamics within visual neural activity. However, most work on LVMs has not explicitly considered temporal relationships in neural activity. To address this, we propose the Time-Evolving Visual Dynamical System (TE-ViDS), a sequential LVM that decomposes neural activity into low-dimensional latent representations that evolve over time. To better align the model with the characteristics of visual neural activity, we split the latent representations into two parts and apply contrastive learning to shape them. Extensive experiments on synthetic datasets and real neural datasets from the mouse visual cortex demonstrate that TE-ViDS achieves the best decoding performance on naturalistic scenes/movies, extracts interpretable latent trajectories that uncover clear underlying neural dynamics, and provides new insights into differences in visual information processing between subjects and between cortical regions. In summary, TE-ViDS is effective at extracting stimulus-relevant embeddings from visual neural activity and contributes to the understanding of visual processing mechanisms. Our code is available at https://github.com/Grasshlw/Time-Evolving-Visual-Dynamical-System.
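
The two learning objectives named above (a contrastive loss for the external latent representations and a KL divergence to a time-dependent prior for the internal ones) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the InfoNCE form of the contrastive loss, the cosine similarity, the `temperature` parameter, the diagonal-Gaussian parameterization, and the function names `info_nce` and `gaussian_kl` are all assumptions made for illustration.

```python
import numpy as np


def _unit(x):
    """Scale vectors along the last axis to unit length (assumed similarity: cosine)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)


def info_nce(reference, positive, negatives, temperature=1.0):
    """InfoNCE-style contrastive loss for a single reference sample.

    reference: (d,), positive: (d,), negatives: (k, d).
    The loss is low when the reference is closer to the positive
    than to any of the negatives.
    """
    r, p, n = _unit(reference), _unit(positive), _unit(negatives)
    pos_logit = (r @ p) / temperature
    neg_logits = (n @ r) / temperature
    logits = np.concatenate([[pos_logit], neg_logits])
    # Numerically stable log-sum-exp over positive and negative logits.
    m = logits.max()
    log_norm = m + np.log(np.exp(logits - m).sum())
    return float(log_norm - pos_logit)


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over all entries.

    With arrays of shape (T, d), the prior parameters (mu_p, logvar_p)
    may differ at every time step, which is what makes the prior
    time-dependent in this sketch.
    """
    return float(0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    ))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 5, 4  # hypothetical sequence length and latent size
    ref = rng.normal(size=d)
    pos = ref + 0.05 * rng.normal(size=d)  # e.g. a sample from the same stimulus
    negs = rng.normal(size=(8, d))         # e.g. samples from other stimuli
    print("contrastive loss:", info_nce(ref, pos, negs, temperature=0.5))

    # Placeholder posterior and prior parameters, one set per time step.
    mu_q, logvar_q = rng.normal(size=(T, d)), np.zeros((T, d))
    mu_p, logvar_p = rng.normal(size=(T, d)), np.zeros((T, d))
    print("KL to time-dependent prior:", gaussian_kl(mu_q, logvar_q, mu_p, logvar_p))
```

In a sequential VAE the prior parameters at each step would typically be produced from a recurrent state rather than drawn at random as in this demo. How the paper weights the two terms against the reconstruction term is not stated here, so no combined loss is shown.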

Paper Structure (26 sections, 16 equations, 12 figures, 9 tables)

Figures (12)

  • Figure 1: Method overview. A. Illustration of TE-ViDS for analyzing visual neural activity in the mouse visual cortex. The encoder extracts spatial features from sequential spike data. The latent variables evolve over time, conditioned on the encoder features and the RNNs' state factors. The decoder maps latent variables to inferred firing rates. B. Illustration of the different learning objectives for the two parts of TE-ViDS's latent representations. For external latent representations, we apply a contrastive loss to encourage them to distinguish the stimulus-relevant components. Given a reference sample (white dot), the red dot is a positive sample and the orange dots are negative samples. For internal latent representations, we use the KL divergence to constrain their distribution to a time-dependent prior distribution.
  • Figure 2: Results on synthetic datasets. A. The true latent variables of the non-temporal dataset. B-D. The inferred latent variables of our model and some alternative models. E. The reconstruction scores of all models on the non-temporal and temporal datasets. The standard error is computed over 10 runs with different random initializations.
  • Figure 3: Results on the mouse neural dataset under natural scene stimuli. A. The decoding scores (%) of the full, external and internal latent representations of TE-ViDS for 118 natural scenes. B. RSMs computed on the original neural representations and TE-ViDS's latent representations, respectively (Mouse 1 and Mouse 2). Each element in a matrix is the similarity between two trials' representations. Each small square involves comparisons between two natural scenes, containing 50 trials.
  • Figure 4: Results on the mouse neural dataset under natural movie stimuli. A. The decoding scores (%) of the full, external and internal latent representations of TE-ViDS. B. The decoding scores (%) for movie frames under different constraint windows between the predicted frames and the true frames. C-E. Visualization results of latent trajectories (Mouse 2). Each color corresponds to all frames within 1 s. Each small dot denotes one frame. Each large dot denotes the average over a group of frames. The red dashed line connects all averages. F. RSMs computed on the original neural representations and TE-ViDS's latent representations (Mouse 2). Each element is the similarity between two frames' representations. Each small square involves comparisons between two trials, containing 900 frames.
  • Figure 5: The decoding scores (%) of TE-ViDS for natural scenes/natural movie frames on six mouse visual cortical region datasets. The box plots are based on 10 runs with different random initializations.
  • ...and 7 more figures
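
Two of the analyses named in the captions above, the RSMs of Figures 3 and 4 and the frame decoding under constraint windows of Figure 4B, can be sketched as follows. The captions do not give the exact definitions, so both are assumptions here: the similarity measure is taken to be the Pearson correlation, and a predicted frame is counted as correct when it lies within `window` frames of the true frame. The function names `rsm` and `windowed_accuracy` are hypothetical.

```python
import numpy as np


def rsm(representations):
    """Representational similarity matrix.

    representations: (n_items, n_features), one row per trial or frame.
    Returns an (n_items, n_items) matrix of Pearson correlations
    between rows (assumed similarity measure).
    """
    return np.corrcoef(np.asarray(representations, dtype=float))


def windowed_accuracy(pred_frames, true_frames, window):
    """Fraction of predictions within `window` frames of the true frame.

    window=0 reduces to exact-match accuracy (assumed definition).
    """
    pred = np.asarray(pred_frames)
    true = np.asarray(true_frames)
    return float(np.mean(np.abs(pred - true) <= window))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latents = rng.normal(size=(6, 16))  # hypothetical: 6 frames, 16 latent dimensions
    print(rsm(latents).shape)

    pred = [0, 5, 10, 40]
    true = [0, 7, 30, 41]
    for w in (0, 1, 2):
        print(w, windowed_accuracy(pred, true, w))
```

Widening the window can only keep or raise the score, so a curve over several window sizes shows how close the incorrect predictions are to the true frame.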