Tensor PCA for Factor Models

Andrii Babii, Eric Ghysels, Junsu Pan

TL;DR

This work generalizes factor modeling to multidimensional tensor data via a Tucker tensor factor model and proposes simple tensor PCA (TPCA) for estimating factors and loadings. Under a strong factor regime, TPCA achieves rate-optimal convergence, while alternating least-squares (ALS) improves estimation when factors are moderately weak. The paper also develops a formal test for the number of factors across tensor modes and demonstrates valid inference for loadings and factors. An empirical application to missing firm characteristics shows TPCA-based imputation can outperform conventional cross-sectional methods, highlighting practical gains in multidimensional panel settings.

Abstract

Modern empirical analysis often relies on high-dimensional panel datasets with non-negligible cross-sectional and time-series correlations. Factor models are natural for capturing such dependencies. A tensor factor model describes the $d$-dimensional panel as a sum of a reduced rank component and an idiosyncratic noise, generalizing traditional factor models for two-dimensional panels. We consider a tensor factor model corresponding to the notion of a reduced multilinear rank of a tensor. We show that for a strong factor model, a simple tensor principal component analysis algorithm is optimal for estimating factors and loadings. When the factors are weak, the convergence rate of simple TPCA can be improved with alternating least-squares iterations. We also provide inferential results for factors and loadings and propose the first test to select the number of factors. The new tools are applied to the problem of imputing missing values in a multidimensional panel of firm characteristics.
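To make the abstract concrete, here is a minimal sketch of one common way to run PCA on a tensor: unfold the tensor along each mode, take the leading eigenvectors of the resulting Gram matrix as loadings, and project onto them to get a core factor tensor. This is an HOSVD-style illustration under assumptions, not necessarily the authors' exact TPCA algorithm. The function names (`unfold`, `tensor_pca`, `reconstruct`), the simulated dimensions, and the ranks are all hypothetical choices made for this example.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-`mode` matricization: rows index the chosen mode, columns all others."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new_dim x old_dim)."""
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def tensor_pca(Y, ranks):
    """HOSVD-style sketch (an assumption, not the paper's stated algorithm)."""
    loadings = []
    for mode, r in enumerate(ranks):
        M = unfold(Y, mode)
        # eigh returns eigenvalues in ascending order, so reverse the columns
        # and keep the r eigenvectors with the largest eigenvalues.
        _, vecs = np.linalg.eigh(M @ M.T)
        loadings.append(vecs[:, ::-1][:, :r])
    # Project Y onto each estimated loading space to obtain the core tensor.
    core = Y
    for mode, U in enumerate(loadings):
        core = mode_product(core, U.T, mode)
    return loadings, core

def reconstruct(loadings, core):
    """Rebuild the low multilinear rank component from loadings and core."""
    out = core
    for mode, U in enumerate(loadings):
        out = mode_product(out, U, mode)
    return out

# Simulated example: a 6 x 5 x 4 tensor with multilinear rank (2, 2, 2).
rng = np.random.default_rng(0)
dims, ranks = (6, 5, 4), (2, 2, 2)
true_core = rng.standard_normal(ranks)
true_loadings = [rng.standard_normal((n, r)) for n, r in zip(dims, ranks)]
signal = reconstruct(true_loadings, true_core)
Y = signal + 0.01 * rng.standard_normal(dims)  # add small idiosyncratic noise

est_loadings, est_core = tensor_pca(Y, ranks)
fitted = reconstruct(est_loadings, est_core)
rel_err = np.linalg.norm(fitted - signal) / np.linalg.norm(signal)
```

In this sketch the estimated loadings are identified only up to rotation within each mode, so comparisons with the truth are best made on the fitted low-rank component (as `rel_err` does) rather than on individual loading columns.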

Paper Structure

This paper contains 27 sections, 12 theorems, 142 equations, 10 figures, 3 tables, 2 algorithms.

Key Result

Proposition 3.1

Under Assumption as:orthogonal (i)

Figures (10)

  • Figure 1: A scalar, and tensors of order $1$ (vector), $2$ (matrix), and $3$
  • Figure 2: Mode-$1$, $2$, and $3$ fibers of a $4\times 5\times 3$ tensor
  • Figure 3: Estimation Accuracy of Tensor PCA: Changing Sizes of Dimensions
  • Figure 4: Asymptotic Distribution Reconciliation
  • Figure 5: Asymptotic Distribution Reconciliation (continued)
  • ...and 5 more figures

Theorems & Definitions (32)

  • Definition 2.1
  • Remark 2.1
  • Remark 2.2
  • Proposition 3.1
  • Theorem 3.1
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Theorem 3.2
  • Remark 3.4
  • ...and 22 more