Generative Principal Component Regression via Variational Inference

Austin Talbot, Corey J Keller, David E Carlson, Alex V Kotlar

TL;DR

This paper tackles the challenge of designing manipulation targets from latent-variable models when predictive signals lie in low-variance components. It introduces generative principal component regression (gPCR), a linear, variational objective that emphasizes the predictive distribution by using the generative posterior and a targeted lower bound, thus aligning latent features with outcomes. Through synthetic data and two neural datasets (stress and social behavior), gPCR outperforms standard PCR and exposes critical limitations of SVAE loadings for target selection, while offering interpretable, smoother predictive loadings. The approach preserves the benefits of generative modeling (imputation, clustering, anomaly detection) and enables reliable stimulation-target design, with broad applicability to latent-variable methods in neuroscience and beyond.

Abstract

The ability to manipulate complex systems, such as the brain, to modify specific outcomes has far-reaching implications, particularly in the treatment of psychiatric disorders. One approach to designing appropriate manipulations is to target key features of predictive models. While generative latent variable models, such as probabilistic principal component analysis (PPCA), are powerful tools for identifying targets, they struggle to incorporate information relevant to low-variance outcomes into the latent space. When stimulation targets are designed on the latent space in such a scenario, the intervention can be suboptimal with minimal efficacy. To address this problem, we develop a novel objective based on supervised variational autoencoders (SVAEs) that ensures such information is represented in the latent space. We refer to the application of this objective to linear models, such as PPCA, as generative principal component regression (gPCR). We show in simulations that gPCR dramatically improves target selection in manipulation as compared to standard PCR and SVAEs. As part of these simulations, we develop a metric for detecting when relevant information is not properly incorporated into the loadings. We then show in two neural datasets related to stress and social behavior that gPCR dramatically outperforms PCR in predictive performance and that SVAEs exhibit low incorporation of relevant information into the loadings. Overall, this work suggests that our method significantly improves target selection for manipulation using latent variable models over competitor inference schemes.
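The failure mode that motivates this work can be illustrated with a small numerical sketch. The example below is not the paper's method, data, or code: it uses hypothetical synthetic data in which the outcome depends only on the lowest-variance direction, and it fits standard PCR (ordinary PCA followed by least squares) rather than gPCR. All names and settings (`pcr_r2`, `scales`, the sample sizes) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 10

# Hypothetical synthetic data: per-feature standard deviations decay, so the
# last coordinate is the lowest-variance direction.
scales = np.linspace(5.0, 0.5, d)
X = rng.normal(size=(n, d)) * scales

# Assumption: the outcome depends only on the lowest-variance coordinate.
y = X[:, -1] / scales[-1] + 0.1 * rng.normal(size=n)


def pcr_r2(X, y, k):
    """Fit standard PCR on the top-k principal components; return in-sample R^2."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T  # scores on the top-k components
    beta, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    resid = yc - Z @ beta
    return 1.0 - (resid @ resid) / (yc @ yc)


r2_top3 = pcr_r2(X, y, 3)  # keeps only the high-variance components
r2_all = pcr_r2(X, y, d)   # keeps every component (equivalent to OLS)
print(f"R^2 with top 3 components: {r2_top3:.3f}")
print(f"R^2 with all {d} components: {r2_all:.3f}")
```

In this toy setting, the components retained for their variance carry almost none of the predictive signal, so targets designed from them would miss the relevant direction. This is the situation that the abstract says gPCR is designed to address.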

Paper Structure

This paper contains 13 sections, 6 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: The top left plot shows the encoder and decoder of the SVAE, the top middle shows the predictive coefficients as detected by PCR, and the top right shows the loadings of our robust model. In the bottom row we plot the encoder mean versus the posterior mean of an SVAE for the supervised and an unsupervised factor. In the middle we plot the predictive ability of the different models via an ROC curve. Finally, the bottom right shows the distribution of stimulation efficacies based on the different models.
  • Figure 2: The predictive coefficients of stress versus nonstress conditions in four brain regions obtained via different regularization methods. Positive values indicate that spectral power is enhanced during stress, while negative values indicate suppression.
  • Figure 3: The predictive coefficients of social versus nonsocial interactions. Positive values indicate power increases in social activity, while negative values indicate suppression.
  • Figure 4: This plot shows relevant quantities of the SVAE learned on the TST dataset. The top left plot shows the encoder and decoder of the SVAE, the top middle shows the predictive coefficients as detected by PCR, and the top right shows the loadings of our robust model. In the bottom row we plot the encoder mean versus the posterior mean of an SVAE for the supervised and an unsupervised factor. In the middle we plot the predictive ability of the different models via an ROC curve. Finally, the bottom right shows the distribution of stimulation efficacies based on the different models.
  • Figure 5: This plot shows relevant quantities of the SVAE learned on the social preference dataset, with the interpretation matching that of Figure 4.