Robust bilinear factor analysis based on the matrix-variate $t$ distribution
Xuan Ma, Jianhua Zhao, Changchun Shang, Fen Jiang, Philip L. H. Yu
TL;DR
The paper addresses robustness for matrix-valued data by proposing $t$bfa, a robust bilinear factor analysis based on the matrix-variate $t$ distribution. It develops two EM-type ML estimation algorithms (ECME and AECM) and introduces parameter-expanded variants (PX-ECME and PX-AECM) to accelerate convergence, along with a closed-form Fisher information matrix to quantify estimator precision. Empirical results on synthetic and real data demonstrate superior robustness and interpretability of $t$bfa compared with vector-based $t$fa and Gaussian matrix-factor models, including a higher breakdown point. The work advances robust matrix modeling by preserving matrix structure, enabling simultaneous row and column factor extraction, and offering practical tools for reliable factor analysis in heavy-tailed settings. It also lays groundwork for extensions to tensors and mixtures, broadening applicability to complex multiway data.
Abstract
Factor analysis based on the multivariate $t$ distribution ($t$fa) is a useful robust tool for extracting common factors from heavy-tailed or contaminated data. However, $t$fa is only applicable to vector data. When $t$fa is applied to matrix data, it is common to first vectorize the matrix observations. This introduces two challenges for $t$fa: (i) the inherent matrix structure of the data is broken, and (ii) robustness may be lost, as vectorizing matrix data typically results in a high data dimension, which could easily lead to the breakdown of $t$fa. To address these issues, starting from the intrinsic matrix structure of matrix data, a novel robust factor analysis model, namely bilinear factor analysis built on the matrix-variate $t$ distribution ($t$bfa), is proposed in this paper. The novelty is that it is capable of simultaneously extracting common factors for both row and column variables of interest from heavy-tailed or contaminated matrix data. Two efficient algorithms for maximum likelihood estimation of $t$bfa are developed. A closed-form expression for the Fisher information matrix, used to calculate the accuracy of parameter estimates, is derived. Empirical studies are conducted to understand the proposed $t$bfa model and compare it with related competitors. The results demonstrate the superiority and practicality of $t$bfa. Importantly, $t$bfa exhibits a significantly higher breakdown point than $t$fa, making it more suitable for matrix data.
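To make the vectorization issue concrete, the following is a minimal sketch, not the paper's implementation. It assumes one common construction of a matrix-variate $t$ variable as a Gaussian scale mixture (a matrix normal draw divided by the square root of a Gamma-distributed weight); the paper's exact parameterization may differ. The function name `sample_matrix_t` and all dimensions are hypothetical choices for illustration. The sketch also compares the number of free covariance entries under a separable (row and column) structure with the unstructured covariance that arises after vectorization.

```python
import numpy as np

def sample_matrix_t(n, M, Sigma_r, Sigma_c, nu, rng):
    """Draw n matrices from a matrix-variate t via a Gaussian scale mixture.

    Assumed parameterization (illustrative only):
      tau ~ Gamma(shape=nu/2, rate=nu/2),  X | tau ~ MatrixNormal(M, Sigma_r / tau, Sigma_c).
    """
    d_r, d_c = M.shape
    L_r = np.linalg.cholesky(Sigma_r)  # factor of the row covariance
    L_c = np.linalg.cholesky(Sigma_c)  # factor of the column covariance
    # NumPy's gamma takes a scale argument, so rate nu/2 becomes scale 2/nu.
    tau = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    Z = rng.standard_normal((n, d_r, d_c))
    noise = L_r @ Z @ L_c.T  # matmul broadcasts over the leading sample axis
    return M + noise / np.sqrt(tau)[:, None, None]

rng = np.random.default_rng(0)
d_r, d_c = 4, 5
X = sample_matrix_t(200, np.zeros((d_r, d_c)), np.eye(d_r), np.eye(d_c), nu=3.0, rng=rng)

# Vectorizing each d_r x d_c observation yields a vector of length d_r * d_c.
vec_dim = X.reshape(len(X), -1).shape[1]

# Entries of an unstructured covariance for the vectorized data ...
full_params = vec_dim * (vec_dim + 1) // 2
# ... versus separate row and column covariances (ignoring the scale
# indeterminacy between the two factors).
sep_params = d_r * (d_r + 1) // 2 + d_c * (d_c + 1) // 2

print(vec_dim, full_params, sep_params)
```

Even for this small 4 by 5 case, the vectorized representation has dimension 20 and an unstructured covariance with 210 distinct entries, against 25 for the separable structure, which illustrates why vectorization can inflate the dimension that a vector-based method must handle.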
