Joint Linked Component Analysis for Multiview Data

Lin Xiao; Luo Xiao

TL;DR

The paper introduces joint_LCA, a method for multiview data that estimates the rank of the shared latent subspace and the view-specific loading matrices simultaneously while decomposing each view into joint and individual components. It poses a penalized optimization problem on the pairwise cross-covariances, with a group-sparsity penalty that selects the common rank, and solves it with an alternating Procrustes-based algorithm plus a per-component refinement; a model refitting step then reduces the shrinkage bias. In simulations and in real-data applications from biological and socio-economic domains, the method identifies the rank robustly, estimates the loadings accurately, and yields interpretable common components, often outperforming sequential CCA-based approaches and JIVE. The method accommodates more than two views and leaves high-dimensional extensions with sparsity constraints and scalable computation as future work.

Abstract

In this work, we propose joint linked component analysis (joint_LCA) for multiview data. Unlike classic methods that extract the shared components sequentially, joint_LCA aims to identify the view-specific loading matrices and the rank of the common latent subspace simultaneously. We formulate a matrix decomposition model in which each data view contains a joint structure and an individual structure, which yields a clean SVD representation for the cross-covariance between any pair of data views. We then propose an objective function with a novel penalty term to achieve simultaneous estimation and rank selection. In addition, a refitting procedure is employed to reduce the shrinkage bias caused by the penalization.
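
To make the "clean SVD representation" concrete, here is a minimal NumPy sketch under stated assumptions; it is not the authors' code. It assumes each view has the form $X_i = U D_i V_i^{\intercal} + U_i D_{i0} V_{i0}^{\intercal}$, with a shared orthonormal score matrix $U$ and individual score matrices $U_i$ that are orthogonal to $U$ and to each other. The symbols $D_i$ and $D_{i0}$ follow the figure captions; the exact model form, the dimensions, and every variable name below are illustrative assumptions. Under these assumptions the cross-covariance $X_1^{\intercal} X_2$ reduces to $V_1 D_1 D_2 V_2^{\intercal}$, so its rank equals the joint rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, r_ind = 200, 2, 3      # samples, joint rank, individual rank (illustrative)
p = [30, 40]                 # feature dimensions of the two views (illustrative)

# One orthonormal basis, split into shared scores U and two individual
# score blocks, so all three blocks are mutually orthogonal (assumption).
Q, _ = np.linalg.qr(rng.standard_normal((n, r + 2 * r_ind)))
U = Q[:, :r]
U_ind = [Q[:, r:r + r_ind], Q[:, r + r_ind:]]

def orthonormal(rows, cols):
    """Return a rows x cols matrix with orthonormal columns."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

X, V, D = [], [], []
for i in range(2):
    V_i = orthonormal(p[i], r)                       # joint loadings
    D_i = np.sort(rng.uniform(0.5, 1.0, r))[::-1]    # joint scales, descending
    V_i0 = orthonormal(p[i], r_ind)                  # individual loadings
    D_i0 = rng.uniform(0.5, 1.0, r_ind)              # individual scales
    X_i = U @ np.diag(D_i) @ V_i.T + U_ind[i] @ np.diag(D_i0) @ V_i0.T
    X.append(X_i)
    V.append(V_i)
    D.append(D_i)

# Cross-covariance between the two views (up to a 1/n factor).
C12 = X[0].T @ X[1]
s = np.linalg.svd(C12, compute_uv=False)
numerical_rank = int(np.sum(s > 1e-10))

print("numerical rank of cross-covariance:", numerical_rank)
print("leading singular values:", s[:r])
print("products D_1 * D_2:      ", D[0] * D[1])
```

In this noiseless toy setting the individual parts cancel in the cross-covariance, which is why working with pairs of views isolates the joint structure. With noise the trailing singular values are no longer exactly zero, which is where a rank-selecting penalty becomes relevant.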

Paper Structure (16 sections, 48 equations, 9 figures, 5 tables)

Introduction
Proposed Method
Estimation
Algorithm
Initialization
Model Refitting Given a Pre-specified Rank
Simulation Study
Three Data Views
Four Data Views
Real Data Application
Nutrimouse Data
Boston Housing Data
Russett Data
Multiview Single Cell Data
Discussion
...and 1 more sections

Figures (9)

Figure 1: Boxplots of the estimation error $\frac{1}{3}\sum_{i=1}^{3}\frac{\|\hat{V}_i\hat{V}_i^{\intercal}-V_iV_i^{\intercal}\|_2^2}{\|V_iV_i^{\intercal}\|_2^2}$ for joint_LCA, JIVE_perm, JIVE_BIC, mCCA and mCIA with $I=3$ data views and joint rank $r_0=2$. Panels (a) and (b) show results for case I, where $D_i$ and $D_{i0}$ are all generated from the standard uniform distribution; panels (c) and (d) show results for case II, where $D_i$ and $D_{i0}$ are generated from the uniform distributions on $[0.5\sqrt{5},\sqrt{5}]$ and $[0.5, 1]$ respectively.
Figure 2: Boxplots of the estimation error $\frac{1}{3}\sum_{i=1}^{3}\frac{\|\hat{V}_i\hat{V}_i^{\intercal}-V_iV_i^{\intercal}\|_2^2}{\|V_iV_i^{\intercal}\|_2^2}$ for joint_LCA, JIVE_perm, JIVE_BIC, mCCA and mCIA with $I=3$ data views and joint rank $r_0=5$. Panels (a) and (b) show results for case I, where $D_i$ and $D_{i0}$ are all generated from the standard uniform distribution; panels (c) and (d) show results for case II, where $D_i$ and $D_{i0}$ are generated from the uniform distributions on $[0.5\sqrt{5},\sqrt{5}]$ and $[0.5, 1]$ respectively.
Figure 3: Boxplots of the estimation error $\frac{1}{4}\sum_{i=1}^{4}\frac{\|\hat{V}_i\hat{V}_i^{\intercal}-V_iV_i^{\intercal}\|_2^2}{\|V_iV_i^{\intercal}\|_2^2}$ for joint_LCA, JIVE_perm, JIVE_BIC, mCCA and mCIA with $I=4$ data views and joint rank $r_0=2$. Panels (a) and (b) show results for case I, where $D_i$ and $D_{i0}$ are all generated from the standard uniform distribution; panels (c) and (d) show results for case II, where $D_i$ and $D_{i0}$ are generated from the uniform distributions on $[0.5\sqrt{5},\sqrt{5}]$ and $[0.5, 1]$ respectively.
Figure 4: Boxplots of the estimation error $\frac{1}{4}\sum_{i=1}^{4}\frac{\|\hat{V}_i\hat{V}_i^{\intercal}-V_iV_i^{\intercal}\|_2^2}{\|V_iV_i^{\intercal}\|_2^2}$ for joint_LCA, JIVE_perm, JIVE_BIC, mCCA and mCIA with $I=4$ data views and joint rank $r_0=5$. Panels (a) and (b) show results for case I, where $D_i$ and $D_{i0}$ are all generated from the standard uniform distribution; panels (c) and (d) show results for case II, where $D_i$ and $D_{i0}$ are generated from the uniform distributions on $[0.5\sqrt{5},\sqrt{5}]$ and $[0.5, 1]$ respectively.
Figure 5: Nutrimouse
...and 4 more figures
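
The estimation error reported in the figure captions compares projection matrices rather than the loading matrices themselves. A minimal sketch of that metric follows, assuming that $\|\cdot\|_2$ denotes the spectral norm and that the error is the average over views of the per-view relative error; the function name `projection_error` is hypothetical and not taken from the paper.

```python
import numpy as np

def projection_error(V_hat_list, V_list):
    """Average over views of ||P_hat - P||_2^2 / ||P||_2^2, with P = V V^T.

    Assumption: ||.||_2 is the spectral norm (largest singular value).
    Pass ord="fro" to np.linalg.norm instead for a Frobenius-norm variant.
    """
    errors = []
    for V_hat, V in zip(V_hat_list, V_list):
        P_hat = V_hat @ V_hat.T   # projection onto the estimated loading space
        P = V @ V.T               # projection onto the true loading space
        num = np.linalg.norm(P_hat - P, 2) ** 2
        den = np.linalg.norm(P, 2) ** 2
        errors.append(num / den)
    return float(np.mean(errors))

# Toy check with orthonormal loadings in R^4 (illustrative values only).
V_true = np.eye(4)[:, :2]
swap = np.array([[0.0, 1.0], [1.0, 0.0]])    # reorders the two columns
V_rotated = V_true @ swap                    # same column space as V_true
V_orthogonal = np.eye(4)[:, 2:]              # orthogonal column space

err_same = projection_error([V_rotated], [V_true])
err_orth = projection_error([V_orthogonal], [V_true])
print(err_same, err_orth)
```

Because $V V^{\intercal}$ is unchanged when the columns of $V$ are rotated or reordered within their span, the metric compares subspaces and does not penalize an estimator for returning a different basis of the same space.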
