SOFARI-R: High-Dimensional Manifold-Based Inference for Latent Responses
Zemin Zheng, Xin Zhou, Jinchi Lv
TL;DR
This work develops SOFARI-R, a high-dimensional, manifold-based framework for statistical inference on latent right singular vectors in multi-task regression, addressing the asymmetry between left and right SVD components. Two variants are proposed: SOFARI-R$_s$ for strongly orthogonal factors and a general SOFARI-R for weakly orthogonal factors, both leveraging Neyman near-orthogonality on Stiefel manifolds to obtain bias-corrected, asymptotically normal estimators without requiring an inverse Hessian in the strong-orthogonality setting. Theoretical results establish asymptotic normality and consistent variance estimation in both regimes, with simulation and real-data analyses (FRED-MD) demonstrating accurate inference and interpretable latent response structure across three layers. Overall, SOFARI-R extends the SOFARI framework to latent responses, enabling inference on all significant eigenvectors in high-dimensional multi-task learning and offering practical tools for predictor- and response-focused latent factor analysis.
Abstract
Data reduction with uncertainty quantification plays a key role in various multi-task learning applications, where large numbers of responses and features are present. To this end, a general framework of high-dimensional manifold-based SOFAR inference (SOFARI) was introduced recently in Zheng, Zhou, Fan and Lv (2024) for interpretable multi-task learning inference, focusing on the left factor vectors and singular values and exploiting the latent singular value decomposition (SVD) structure. Yet a valid inference procedure for the latent right factor vectors does not follow directly from the one for the left factor vectors, and designing it can be even more challenging because left and right singular vectors play asymmetric roles in the response matrix. To tackle these issues, in this paper we suggest a new method of high-dimensional manifold-based SOFAR inference for latent responses (SOFARI-R), of which two variants are introduced. The first variant deals with strongly orthogonal factors by coupling left singular vectors with the design matrix and then appropriately rescaling them to generate new Stiefel manifolds. The second variant handles the more general weakly orthogonal factors by employing the hard-thresholded SOFARI estimates and delicately incorporating approximation errors into the distribution. Both variants produce bias-corrected estimators for the latent right factor vectors that enjoy asymptotically normal distributions with justified asymptotic variance estimates. We demonstrate the effectiveness of the newly suggested method using extensive simulation studies and an economic application.
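To give some intuition for the first variant's idea of "coupling left singular vectors with the design matrix and then rescaling", the following is a minimal noiseless toy sketch in NumPy. It is not the SOFARI-R procedure itself: the dimensions, the exactly orthonormal design, the singular values, and all variable names are assumptions made for illustration, and the rescaling used in the paper may differ. The sketch only shows that, when the latent factors are exactly orthogonal, the rescaled vectors form an orthonormal frame (a point on a Stiefel manifold) from which the right factor vectors can be read off.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 200, 10, 8, 3  # illustrative sizes, not taken from the paper

# Assumed design with X^T X / n = I_p, so the latent factors X l_k are
# exactly orthogonal in this toy setting.
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = np.sqrt(n) * Q

# Rank-r coefficient matrix C = L D R^T built from orthonormal factors.
L, _ = np.linalg.qr(rng.standard_normal((p, r)))
R, _ = np.linalg.qr(rng.standard_normal((q, r)))
d = np.array([5.0, 3.0, 1.5])  # hypothetical singular values
C = L @ np.diag(d) @ R.T

# Couple the left factor vectors with the design, then rescale each
# column to unit length.
XL = X @ L
scale = np.linalg.norm(XL, axis=0)
F = XL / scale

# The columns of F are orthonormal: F lies on a Stiefel manifold.
gram = F.T @ F

# Noiseless signal X C = sum_k d_k * scale_k * f_k r_k^T, so projecting
# onto f_k and undoing the scaling recovers r_k.
R_rec = (X @ C).T @ F / (d * scale)

print(np.allclose(gram, np.eye(r)), np.allclose(R_rec, R))
```

In practice the response is observed with noise and the left factors are themselves estimated, which is where the bias correction and the variance estimates described in the abstract come in; none of that is captured by this sketch.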
