Multiple Testing under High-dimensional Dynamic Factor Model
Xinxin Yang, Lilun Du
TL;DR
This work tackles large-scale multiple testing in high-dimensional time series modeled by dynamic factor models with nonlinear serial dependence. It develops a scalable testing framework based on two-pass Fama–MacBeth regression combined with principal component analysis (PCA) and a chronological sample-splitting scheme. The two subsamples yield statistics $T_i^{(1)}$ and $T_i^{(2)}$ whose product $T_i^{(1)}T_i^{(2)}$ is symmetric under the null, which enables a data-driven false discovery rate (FDR) threshold without explicit long-run variance estimation. The authors establish a new concentration inequality for causal processes, a uniform central limit theorem (CLT) for the mean estimators, and an FDR control theorem under serial dependence, with extensions to heteroskedastic and non-sparse settings and a bias-correction variant using negative controls. In extensive simulations and an empirical hedge-fund analysis, the proposed YD method shows robust FDR control and competitive power in the presence of temporal dependence and heavy tails, outperforming several existing methods in challenging settings.
Abstract
Large-scale multiple testing under static factor models is widely used to detect sparse signals in high-dimensional data. However, static factor models are arguably too restrictive because they ignore serial correlation, which seriously distorts error rate control in large-scale inference. In this manuscript, we propose a new multiple testing procedure under dynamic factor models that is robust to nonlinear serial dependence. The idea is to combine a new sample-splitting strategy based on chronological order with a two-pass Fama–MacBeth regression to form a series of test statistics with marginal symmetry properties, and then to use these properties to obtain a data-driven threshold. We show that our procedure controls the false discovery rate asymptotically under high-dimensional dynamic factor models. As a byproduct of independent interest, we establish a new exponential-type deviation inequality for sums of random variables over various functionals of linear and nonlinear processes. Our numerical results, including a case study on hedge fund selection, demonstrate the advantage of the proposed method over several state-of-the-art methods.
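To make the idea of a symmetry-based, data-driven threshold concrete, the following is a minimal sketch rather than the procedure of the paper. The abstract does not state the threshold formula, so the rule used here is an assumption: a generic symmetry-based estimate that picks the smallest $t > 0$ with $(1 + \#\{i: W_i \le -t\}) / \max(\#\{i: W_i \ge t\}, 1) \le \alpha$, where $W_i$ is the product of two statistics computed on separate subsamples. The function names `symmetric_fdr_threshold` and `select_by_symmetry` are hypothetical, and the simulated Gaussian statistics are stand-ins for statistics that would come from the two-pass regression on chronologically split data.

```python
import numpy as np


def symmetric_fdr_threshold(w, alpha=0.1):
    """Return the smallest t > 0 such that
    (1 + #{w_i <= -t}) / max(#{w_i >= t}, 1) <= alpha,
    or np.inf if no candidate t satisfies the bound.

    Assumption: null statistics are symmetric about zero, so the count
    in the left tail estimates the number of false discoveries in the
    right tail.
    """
    w = np.asarray(w, dtype=float)
    # Only the observed nonzero magnitudes need to be checked as candidates.
    candidates = np.unique(np.abs(w[w != 0]))  # np.unique returns sorted values
    for t in candidates:
        est_false = 1 + np.sum(w <= -t)
        n_selected = max(np.sum(w >= t), 1)
        if est_false / n_selected <= alpha:
            return float(t)
    return np.inf


def select_by_symmetry(w, alpha=0.1):
    """Indices whose statistic is at least the data-driven threshold."""
    w = np.asarray(w, dtype=float)
    return np.flatnonzero(w >= symmetric_fdr_threshold(w, alpha))


# Illustrative simulation (assumed setup, not data from the paper):
# 2000 hypotheses, the first 100 carry a signal in both subsamples.
rng = np.random.default_rng(0)
p, n_signal = 2000, 100
t1 = rng.normal(size=p)   # stand-in for statistics from the first subsample
t2 = rng.normal(size=p)   # stand-in for statistics from the second subsample
t1[:n_signal] += 4.0
t2[:n_signal] += 4.0
w = t1 * t2               # product is symmetric about zero for null hypotheses

rejected = select_by_symmetry(w, alpha=0.1)
false_rejections = np.sum(rejected >= n_signal)
print(len(rejected), "rejections, of which", false_rejections, "are true nulls")
```

The product is large and positive when both subsample statistics agree in sign and magnitude, while for a null hypothesis it is equally likely to be positive or negative, which is why the left tail can serve as a proxy for false discoveries in the right tail. In this sketch the two halves are drawn independently; with serially dependent data that independence would hold only approximately.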
