Near-Efficient and Non-Asymptotic Multiway Inference
Oscar López, Arvind Prasadan, Carlos Llosa-Vite, Richard B. Lehoucq, Daniel M. Dunlavy
TL;DR
This work develops a non-asymptotic framework, based on the Cramér-Rao lower bound (CRLB), for benchmarking canonical polyadic (CP) tensor inference on Poisson count data. In the rank-one CP setting, a shifted-Poisson regression estimator is near-efficient: its variance matches the CRLB up to absolute constants and logarithmic factors. At higher CP ranks, factor estimation can fall short of the CRLB, yet full parametric inference remains nearly minimax optimal, with favorable dependence on the rank. The authors derive non-asymptotic bounds on the mean squared error (MSE) and the Fisher information, provide a minimax lower bound, and validate the theory with numerical experiments, clarifying when CP-based multiway analysis is reliable in finite samples. The results offer practical non-asymptotic benchmarks for CP-based inference in high-dimensional tensor Poisson models and motivate future work on efficient higher-rank estimators and broader noisy tensor settings.
Abstract
We establish non-asymptotic efficiency guarantees for tensor decomposition-based inference in count data models. Under a Poisson framework, we consider two related goals: (i) parametric inference, the estimation of the full distributional parameter tensor, and (ii) multiway analysis, the recovery of its canonical polyadic (CP) decomposition factors. Our main result shows that in the rank-one setting, a rank-constrained maximum-likelihood estimator achieves multiway analysis with variance matching the Cramér-Rao Lower Bound (CRLB) up to absolute constants and logarithmic factors. This provides a general framework for studying "near-efficient" multiway estimators in finite-sample settings. For higher ranks, we illustrate that our multiway estimator may not attain the CRLB; nevertheless, CP-based parametric inference remains nearly minimax optimal, with error bounds that improve on prior work by offering more favorable dependence on the CP rank. Numerical experiments corroborate near-efficiency in the rank-one case and highlight the efficiency gap in higher-rank scenarios.
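To make the rank-one setting concrete, the following is a minimal NumPy sketch, not the authors' estimator or experimental code. It assumes a three-way count tensor whose entries are independent Poisson draws with mean tensor M = a ∘ b ∘ c, and uses the standard closed form of the unconstrained rank-one Poisson maximum-likelihood estimate, which is the product of the three one-way marginal sums divided by the squared grand total. The function name `rank_one_poisson_mle`, the tensor dimensions, and the factor ranges are hypothetical choices made only for illustration.

```python
import numpy as np


def rank_one_poisson_mle(X):
    """Closed-form rank-one Poisson MLE of the mean tensor of a 3-way count tensor.

    Illustrative assumption: entries X[i, j, k] are independent Poisson counts
    with mean a[i] * b[j] * c[k]. Setting the likelihood gradient to zero gives
    M_hat[i, j, k] = X[i,+,+] * X[+,j,+] * X[+,+,k] / X[+,+,+]**2.
    """
    X = np.asarray(X, dtype=float)
    total = X.sum()
    if total == 0:
        # An all-zero tensor gives an all-zero estimate.
        return np.zeros_like(X)
    m1 = X.sum(axis=(1, 2))  # mode-1 marginal sums
    m2 = X.sum(axis=(0, 2))  # mode-2 marginal sums
    m3 = X.sum(axis=(0, 1))  # mode-3 marginal sums
    return np.einsum("i,j,k->ijk", m1, m2, m3) / total**2


# Simulate a small rank-one Poisson tensor (hypothetical sizes and factor ranges).
rng = np.random.default_rng(0)
a = rng.uniform(1.0, 3.0, size=6)
b = rng.uniform(1.0, 3.0, size=5)
c = rng.uniform(1.0, 3.0, size=4)
M = np.einsum("i,j,k->ijk", a, b, c)  # true mean (parameter) tensor
X = rng.poisson(M)                    # observed counts

M_hat = rank_one_poisson_mle(X)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative error of the rank-one estimate: {rel_err:.3f}")
```

In this sketch the estimate of the full tensor corresponds to goal (i), parametric inference. The marginal sums `m1`, `m2`, `m3` are proportional to the factor estimates relevant to goal (ii), multiway analysis, but the individual factors are determined only up to a rescaling that leaves their outer product unchanged.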
