Knowledge Transfer across Multiple Principal Component Analysis Studies

Zeyu Li; Kangxiang Qin; Yong He; Wang Zhou; Xinsheng Zhang

TL;DR

The paper develops a two-step transfer-learning framework for unsupervised PCA across multiple studies by extracting shared subspace information with a Grassmannian barycenter and then debiasing to recover the target's private subspace. It establishes that knowledge transfer can enlarge the eigenvalue gap to facilitate private-subspace estimation and proves asymptotic normality for bilinear forms of spectral projectors under weaker conditions. When informative sources are unknown, it introduces a rectified Grassmannian K-means approach for dataset selection, enabling scalable and robust performance across many sources. The theory is extended to elliptical PCA to handle heavy-tailed data, and the practical value is demonstrated via simulations and a real activity-recognition dataset, showing improved estimation and inference with reduced variance after transfer.

Abstract

Transfer learning has aroused great interest in the statistical community. In this article, we focus on knowledge transfer for unsupervised learning tasks, in contrast to the supervised learning tasks studied in the literature. Given the transferable source populations, we propose a two-step transfer learning algorithm to extract useful information from multiple source principal component analysis (PCA) studies, thereby enhancing estimation accuracy for the target PCA task. In the first step, we integrate the shared subspace information across multiple studies by a method we call the Grassmannian barycenter, instead of directly performing PCA on the pooled dataset. The Grassmannian barycenter method enjoys robustness and computational advantages in more general cases. In the second step, the resulting estimator of the shared subspace is used to estimate the target private subspace. Our theoretical analysis credits the gain of knowledge transfer between PCA studies to the enlarged eigenvalue gap, which differs from existing supervised transfer learning tasks, where sparsity plays the central role. In addition, we prove that the bilinear forms of the empirical spectral projectors are asymptotically normal under weaker eigenvalue gap conditions after knowledge transfer. When the set of informative sources is unknown, we endow our algorithm with the capability of useful dataset selection by solving a rectified optimization problem on the Grassmann manifold, which in turn leads to a computationally friendly rectified Grassmannian K-means procedure. Finally, extensive numerical simulation results and a real data case concerning activity recognition are reported to support our theoretical claims and to illustrate the empirical usefulness of the proposed transfer learning methods.
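The two-step procedure described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: that the Grassmannian barycenter is computed as the leading eigenvectors of a sample-size-weighted average of the individual projection matrices, that every study uses the same rank `r_shared + r_private`, and that the private subspace is obtained by PCA on the target covariance projected onto the orthogonal complement of the shared estimate. All function and parameter names are hypothetical and are not the paper's notation.

```python
import numpy as np


def top_eigvecs(sym, r):
    """Return the r leading eigenvectors (as columns) of a symmetric matrix."""
    _, vecs = np.linalg.eigh(sym)  # eigh sorts eigenvalues in ascending order
    return vecs[:, ::-1][:, :r]


def grassmannian_barycenter(projectors, weights, r_shared):
    """Assumed form: leading eigenvectors of the weighted mean projector."""
    avg = sum(w * P for w, P in zip(weights, projectors))
    return top_eigvecs(avg, r_shared)


def transfer_pca(target_cov, source_covs, sizes, r_shared, r_private):
    """Sketch of a two-step transfer PCA.

    sizes[0] is the target sample size; sizes[1:] are the source sizes.
    Returns orthonormal bases for the shared and target-private subspaces.
    """
    covs = [target_cov] + list(source_covs)
    rank = r_shared + r_private  # assumption: common rank across studies

    # Step 1: individual PCA per study, then combine the projectors.
    projectors = []
    for cov in covs:
        U = top_eigvecs(cov, rank)
        projectors.append(U @ U.T)
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    U_shared = grassmannian_barycenter(projectors, weights, r_shared)

    # Step 2: remove the shared directions from the target covariance,
    # then take the leading eigenvectors of what remains.
    p = target_cov.shape[0]
    Q = np.eye(p) - U_shared @ U_shared.T
    U_private = top_eigvecs(Q @ target_cov @ Q, r_private)
    return U_shared, U_private
```

Averaging projection matrices rather than pooling raw data means each study contributes only its estimated subspace, so differences in scale or private directions across studies enter the combination only through their projectors.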

Paper Structure (32 sections, 15 theorems, 97 equations, 7 figures, 2 algorithms)

Introduction
Knowledge transfer framework
Closely related works and our contributions
Organization and notations
Methodology
Oracle knowledge transfer
Unknown informative sources
Statistical Theory
Oracle knowledge transfer
Asymptotic normality of bilinear forms
Non-oracle knowledge transfer
Extension to elliptical PCA
Numerical Simulation
Comparisons of various methods
Statistical rates
...and 17 more sections

Key Result

Lemma 1

Under Assumption 1, let $\Delta_k = \widetilde{P}_k-P_k^*$, we have as $n_k$, $p\rightarrow \infty$, where $\widetilde{n}_k$ is called the effective sample size of the $k$-th PCA study.

Figures (7)

Figure 1: Average subspace estimation error using various methods under Gaussian distribution with classical PCA, based on 100 replications. From left to right we report S1 (no inclusion of useless sources), S2 (mild inclusion of useless sources) and S3 (severe inclusion of useless sources), respectively.
Figure 2: Averaged subspace estimation error using various methods under $t_3$ distribution with elliptical PCA, based on 100 replications. From left to right we report S1 (no inclusion of useless sources), S2 (mild inclusion of useless sources) and S3 (severe inclusion of useless sources), respectively.
Figure 3: Validation of the statistical rates in the main theorem. From left to right we show how the private term changes with $\delta_{\text{p}}$; how the variance term changes with $n$ and $K$; how the bias term changes with $n$; and how the subspace deviation term changes with $h$.
Figure 4: Histograms of the re-scaled bilinear forms with respect to the empirical spectral projectors acquired before and after knowledge transfer, compared to the standard Gaussian distribution, based on $10000$ replications.
Figure 5: Mean of the Average relative information preservation Ratio (AR) on the testing datasets for different knowledge transfer subspace estimators against the individual PCA estimator $\widetilde{P}_k$ with varying $i=(r_\text{s}-1)/2$, based on 100 replications.
...and 2 more figures

Theorems & Definitions (17)

Lemma 1: Individual PCA error
Theorem 1: Oracle knowledge transfer
Corollary 1: Bilinear forms
Theorem 2: Non-oracle knowledge transfer
Corollary 2: Extension to elliptical PCA
Proposition 1: Lemma 2 from fan2019distributed
proof
Proposition 2: Theorem 2.5 from bosq2000stochastic
Lemma 2: Shared subspace estimator
Lemma 3: Private subspace estimator
...and 7 more
