Bayesian Multi-Group Functional Factor Models with Parameter-Expanded Cumulative Shrinkage Priors

Xuanye Dai, Anna Gottard, Michele Guindani, Marina Vannucci

Abstract

Functional data consist of trajectories observed over a continuous domain, such as time, space, or wavelength. Here we consider curves observed on different groups of subjects and propose a Bayesian multi-group functional factor analysis framework that jointly models the data via an explicit decomposition into group-specific mean functions and latent components that capture both common and distinct latent structures across the groups. We represent these functional components as linear combinations of a common set of B-spline bases, achieving a low-rank representation of the latent factors. We further impose a parameter-expanded cumulative shrinkage process prior on the factor loadings, which induces increasing shrinkage and automatically selects the number of active shared and group-specific factors. We evaluate the model's performance through simulation studies and show that the model accurately recovers the number of underlying factors and effectively distinguishes variations in functional observations driven by shared versus group-specific complex structures under various scenarios. For real data analysis, we apply the model to EEG data on alcoholic and healthy subjects and identify shared latent factors that capture canonical characteristic components of the EEG curves, along with group-specific factors that reveal specific neural activity patterns.
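
To make the idea of "increasing shrinkage" concrete, the following is a minimal sketch of the standard cumulative shrinkage process construction from the literature, in which the probability that a loading column falls into a near-zero spike grows with the column index. It is not the parameter-expanded variant proposed in the paper, whose details are not given in the abstract. The function names, the truncation level, and the hyperparameters (`alpha`, `spike`, `slab_shape`, `slab_scale`) are illustrative assumptions.

```python
import numpy as np


def cusp_spike_probabilities(n_factors, alpha, rng):
    """Cumulative spike probabilities pi_1 <= ... <= pi_H via stick-breaking.

    v_h ~ Beta(1, alpha), w_h = v_h * prod_{m<h} (1 - v_m), pi_h = sum_{l<=h} w_l.
    """
    v = rng.beta(1.0, alpha, size=n_factors)
    v[-1] = 1.0  # truncation: the last stick takes all remaining mass
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    weights = v * remaining
    return np.cumsum(weights)


def draw_column_variances(pi, rng, spike=0.05, slab_shape=2.0, slab_scale=2.0):
    """Draw one variance per loading column from a spike-and-slab mixture.

    Column h is assigned the small spike value with probability pi_h and
    an inverse-gamma slab draw otherwise (hypothetical hyperparameters).
    """
    in_spike = rng.random(pi.shape[0]) < pi
    # If X ~ Gamma(shape, rate=slab_scale), then 1/X is inverse-gamma.
    slab = 1.0 / rng.gamma(slab_shape, 1.0 / slab_scale, size=pi.shape[0])
    return np.where(in_spike, spike, slab), in_spike


rng = np.random.default_rng(0)
pi = cusp_spike_probabilities(n_factors=10, alpha=3.0, rng=rng)
theta, in_spike = draw_column_variances(pi, rng)
n_active = int((~in_spike).sum())  # columns left in the slab count as active
```

Because the spike probabilities are nondecreasing in the column index, later columns are shrunk toward zero more often, and the number of columns remaining in the slab gives a draw of the number of active factors.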

Paper Structure

This paper contains 17 sections, 20 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: Simulation Study. Posterior frequencies of the $15$ most frequently sampled factor configurations for one replicate from Scenario A $(3,2,2)$ with $n_{1}=40$ and $n_{2}=80$, based on $10{,}000$ retained MCMC draws after burn-in. The modal factor configuration coincides with the true configuration $(3,2,2)$.
  • Figure 2: Simulation Study. Comparison of heatmaps for the true (top) and estimated (bottom) covariance-derived shared and group-specific factor loadings for the same replicate from Scenario A $(3,2,2)$ with $n_{1}=40$ and $n_{2}=80$ shown in Figure 1.
  • Figure 3: Simulation Study. True and estimated curves $f_{s}$, together with pointwise $95\%$ credible intervals, for two representative subjects from each group for the same replicate from Scenario A $(3,2,2)$ with $n_{1}=40$ and $n_{2}=80$ shown in Figure 1.
  • Figure 4: Simulation Study. Comparison of simulated curves $Y_s$ with true and estimated curves $f_{s}$ for all subjects in the two groups, for the same replicate from Scenario A $(3,2,2)$ with $n_{1}=40$ and $n_{2}=80$ shown in Figure 1. Top row: simulated $Y_s$ for 40 subjects (left) and 80 subjects (right) over the 60 time points, with subject-level curves in green and averaged curves in blue. Middle row: true curves $f_{s}$ for the two groups, with subject-level curves in green and averaged curves in blue. Bottom row: estimated curves $\widehat{f}_{s}$, shown as posterior means, with subject-level curves in green and averaged curves in blue.
  • Figure 5: Simulation Study. Boxplots of the RV coefficients between the true shared and group-specific factor loadings, $B\Lambda_{\mathrm{true}}$ and $B\Phi_{s,\mathrm{true}}$, and the estimated covariance-derived shared and group-specific factor loadings, $\widetilde{\Lambda}$ and $\widetilde{\Phi}_{s}$, averaged across replicates where the estimates of $L^{*}$, $K_{1}^{*}$, and $K_{2}^{*}$ equal the true values, among 100 simulated datasets, for all scenarios described in Section \ref{subsec:gatagen}.
  • ...and 6 more figures
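
The RV coefficient used in Figure 5 measures the similarity of two loading matrices that share the same rows, and it is unaffected by rotation or rescaling of the factors. A minimal sketch of the standard definition follows, assuming uncentered loading matrices; the paper may compute it differently, and the function name and example matrices are illustrative only.

```python
import numpy as np


def rv_coefficient(X, Y):
    """RV coefficient between two matrices with the same number of rows.

    RV = tr(XX' YY') / sqrt(tr((XX')^2) * tr((YY')^2)), a value in [0, 1].
    """
    Sx = X @ X.T
    Sy = Y @ Y.T
    return np.trace(Sx @ Sy) / np.sqrt(np.trace(Sx @ Sx) * np.trace(Sy @ Sy))


rng = np.random.default_rng(1)
true_loadings = rng.normal(size=(60, 3))        # e.g. 60 time points, 3 factors
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random orthogonal rotation
rotated = 2.0 * true_loadings @ Q               # same column space, rotated and rescaled
unrelated = rng.normal(size=(60, 3))

rv_same = rv_coefficient(true_loadings, rotated)
rv_diff = rv_coefficient(true_loadings, unrelated)
```

Here `rv_same` equals 1 up to floating-point error, since the rotated and rescaled loadings imply a proportional covariance contribution, while `rv_diff` is smaller because the unrelated loadings span a different subspace.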