Factor Analysis with Correlated Topic Model for Multi-Modal Data

Małgorzata Łazęcka, Ewa Szczurek

TL;DR

FACTM is a Bayesian framework that unifies factor analysis (FA) with a correlated topic model (CTM) to handle both simple and structured multimodal data. A sample-specific link variable $\mu_n$ ties the FA and CTM components together, and a supervised rotation mechanism aligns the latent factors with binary features, yielding interpretable factors and topic structures across views. In simulations and on real datasets (video sentiment, music genres, and COVID-19 scRNA-seq/CT/cytometry), FACTM is reported to estimate parameters more accurately than competing methods, to achieve competitive predictive performance, and to improve interpretability, including biologically coherent clustering in complex data. The approach supports interpretable integration of heterogeneous data modalities for discovery and classification tasks.

Abstract

Integrating various data modalities brings valuable insights into underlying phenomena. Multimodal factor analysis (FA) uncovers shared axes of variation underlying different simple data modalities, where each sample is represented by a vector of features. However, FA is not suited for structured data modalities, such as text or single-cell sequencing data, where multiple data points are measured per sample and exhibit a clustering structure. To overcome this challenge, we introduce FACTM, a novel multi-view and multi-structure Bayesian model that combines FA with correlated topic modeling and is optimized using variational inference. Additionally, we introduce a method for rotating latent factors to enhance interpretability with respect to binary features. On text and video benchmarks as well as real-world music and COVID-19 datasets, we demonstrate that FACTM outperforms other methods in identifying clusters in structured data and in integrating them with simple modalities via the inference of shared, interpretable factors.
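
To make the idea of coupling FA with a correlated topic model concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a standard normal prior on the shared factors, a linear-Gaussian simple view, and a link $\mu_n$ that is linear in the factors and shifts the mean of the topic logits $\eta_n$. The function name `sample_factm_like`, the weight matrices `w_simple` and `w_link`, the noise scale, and all dimensions are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax for a 1-D array of logits."""
    e = np.exp(x - x.max())
    return e / e.sum()


def sample_factm_like(n_samples=5, n_factors=3, n_features=4,
                      n_topics=3, vocab_size=6, n_points=20, seed=0):
    """Draw toy data from an FA + CTM style model (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Shared latent factors, one row per sample (assumed standard normal prior).
    z = rng.normal(size=(n_samples, n_factors))

    # Simple view: each sample is one feature vector, linear in the factors.
    w_simple = rng.normal(size=(n_factors, n_features))
    y_simple = z @ w_simple + 0.1 * rng.normal(size=(n_samples, n_features))

    # Link variable mu_n: sample-specific shift of the topic logits.
    # Assumption: mu_n is a linear function of the shared factors z_n.
    w_link = rng.normal(size=(n_factors, n_topics))
    mu = z @ w_link

    # Population-level topic mean and covariance (assumed values).
    mu0 = np.zeros(n_topics)
    sigma0 = 0.5 * np.eye(n_topics)

    # Each topic is a distribution over a discrete vocabulary.
    topics = rng.dirichlet(np.ones(vocab_size), size=n_topics)

    eta = np.empty((n_samples, n_topics))
    docs = []
    for n in range(n_samples):
        # CTM step: logistic-normal topic proportions centred at mu0 + mu_n.
        eta[n] = rng.multivariate_normal(mu0 + mu[n], sigma0)
        theta = softmax(eta[n])
        # Structured view: many data points per sample, each assigned to a topic.
        assignments = rng.choice(n_topics, size=n_points, p=theta)
        words = np.array([rng.choice(vocab_size, p=topics[k]) for k in assignments])
        docs.append(words)

    return z, y_simple, mu, eta, docs
```

In this sketch the same factors `z` drive both the simple view `y_simple` and, through `mu`, the topic proportions of the structured view, which is the sense in which the two components are tied together.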

Paper Structure

This paper contains 45 sections, 17 equations, 16 figures, 5 tables.

Figures (16)

  • Figure 1: Graphical representation of FACTM. A single structured view is shown (in blue), although any number is possible.
  • Figure 2: Comparison of true factors and optimally reordered latent factors in factor analysis models.
  • Figure 3: Comparison of true parameters and inferred parameters following the optimal reordering of topics in topic models.
  • Figure 4: Comparison of true and inferred population-level variables in FACTM and CTM (other topic models do not account for population mean and covariance of topics). The Frobenius distance is computed relative to the Frobenius norm of the true covariance matrix. $\tilde{\Sigma}^{(0)}$ represents the covariance matrix scaled to have ones on the diagonal. Dashed lines indicate optimal performance.
  • Figure 5: A. Topic's average positivity and negativity, measured as the weighted average of positive/negative words in a topic. B&C. The values of $\eta_{\cdot, 3}$ (B) and $\eta_{\cdot, 4}$ (C) split by class membership of the samples with two-sided Wilcoxon pairwise tests. Stars denote significance of Bonferroni-adjusted p-values: *** $< 0.001$, ** $< 0.01$, * $< 0.05$.
  • ...and 11 more figures
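
The two quantities mentioned in the Figure 4 caption can be sketched as follows. This is an illustrative reading of the caption, not code from the paper: the relative Frobenius distance is taken as $\lVert \hat{\Sigma} - \Sigma \rVert_F / \lVert \Sigma \rVert_F$, and the scaled matrix $\tilde{\Sigma}^{(0)}$ is taken as the covariance divided elementwise by the outer product of its standard deviations. The function names are hypothetical.

```python
import numpy as np


def scale_to_unit_diagonal(cov):
    """Rescale a covariance matrix so that its diagonal entries equal one."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)


def relative_frobenius_distance(cov_true, cov_est):
    """Frobenius distance between two matrices, relative to the norm of the true one."""
    diff_norm = np.linalg.norm(cov_est - cov_true, ord="fro")
    return diff_norm / np.linalg.norm(cov_true, ord="fro")


# Toy example with made-up numbers.
cov_true = np.array([[4.0, 2.0],
                     [2.0, 9.0]])
cov_est = 2.0 * cov_true

scaled = scale_to_unit_diagonal(cov_true)
distance = relative_frobenius_distance(cov_true, cov_est)
```

Under this reading, a perfect estimate gives a distance of zero, which would correspond to the optimal-performance level marked by a dashed line.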