Wild Bootstrap Inference for Non-Negative Matrix Factorization with Random Effects

Kenichi Satoh

TL;DR

The paper introduces non-negative matrix factorization with random effects (NMF-RE), a mean-structure latent-variable model that combines covariate-driven scores with unit-specific deviations. Applied examples illustrate stable, interpretable covariate-effect inference.

Abstract

Non-negative matrix factorization (NMF) is widely used for parts-based representations, yet formal inference for covariate effects is rarely available when the basis is learned under non-negativity. We introduce non-negative matrix factorization with random effects (NMF-RE), a mean-structure latent-variable model $Y=X(\Theta A+U)+\mathcal{E}$ that combines covariate-driven scores with unit-specific deviations. Random effects act as a working device for modeling heterogeneity and controlling complexity; we monitor their effective degrees of freedom and enforce a df-based cap to prevent near-saturated fits. Estimation alternates closed-form ridge (BLUP-like) updates for $U$ with multiplicative non-negative updates for $X$ and $\Theta$. For inference on $\Theta$, we condition on $(\widehat X,\widehat U)$ and obtain fast uncertainty quantification via asymptotic linearization, a one-step Newton update, and a multiplier (wild) bootstrap; this avoids repeated constrained re-optimization. Simulations include a targeted stress test showing that, without df control, the random-effects penalty can collapse and inference for $\Theta$ becomes degenerate, whereas the df-cap prevents this failure mode. The non-negativity constraint induces sparse, parts-based loadings -- a measurement-side variable selection -- while inference on $\Theta$ identifies which covariates affect which components, providing covariate-side selection. Longitudinal, psychometric, spatial-flow, and text examples further illustrate stable, interpretable covariate-effect inference.
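
As a rough illustration of two steps named in the abstract, the sketch below shows a closed-form ridge update for $U$ with its effective degrees of freedom, and a multiplier (wild) bootstrap for $\Theta$ that conditions on $X$ and $U$ and perturbs a one-step Newton update instead of refitting. It is not the author's implementation. It assumes shapes $Y$ ($P\times N$), $X$ ($P\times Q$), $\Theta$ ($Q\times K$), $A$ ($K\times N$), $U$ ($Q\times N$), a squared-error working objective with ridge penalty `lam`, an unconstrained least-squares estimate of $\Theta$ (non-negativity is ignored here), and Rademacher multipliers. The function names `ridge_u_step` and `multiplier_bootstrap_theta` are hypothetical, and the df cap itself is not enforced in this sketch.

```python
import numpy as np


def ridge_u_step(Y, X, Theta, A, lam):
    """Ridge update for U given X and Theta (assumed working objective).

    Minimizes ||Y - X(Theta A + U)||_F^2 + lam * ||U||_F^2 over U.
    Returns U (Q x N) and the effective degrees of freedom of the fit.
    """
    Q = X.shape[1]
    N = Y.shape[1]
    R = Y - X @ Theta @ A                    # residual after the covariate-driven part
    G = X.T @ X + lam * np.eye(Q)            # ridge-regularized Gram matrix
    U = np.linalg.solve(G, X.T @ R)          # closed-form solution, one column per unit
    # Trace of the ridge hat matrix X G^{-1} X', summed over the N units.
    df = N * np.trace(np.linalg.solve(G, X.T @ X))
    return U, df


def multiplier_bootstrap_theta(Y, X, U, A, B=500, seed=0):
    """Multiplier bootstrap standard errors for Theta, conditioning on X and U.

    Assumption: Theta is estimated by unconstrained least squares, so the
    Hessian factorizes into X'X and AA' and no re-optimization is needed.
    """
    rng = np.random.default_rng(seed)
    N = Y.shape[1]
    R = Y - X @ U                            # remove the unit-specific deviations
    XtX_inv = np.linalg.inv(X.T @ X)
    AAt_inv = np.linalg.inv(A @ A.T)
    Theta_hat = XtX_inv @ X.T @ R @ A.T @ AAt_inv
    E = R - X @ Theta_hat @ A                # residuals, P x N
    S = X.T @ E                              # per-unit score pieces, Q x N
    draws = np.empty((B,) + Theta_hat.shape)
    for b in range(B):
        w = rng.choice([-1.0, 1.0], size=N)  # Rademacher multipliers, one per unit
        # One-step Newton update driven by the reweighted scores.
        draws[b] = Theta_hat + XtX_inv @ (S * w) @ A.T @ AAt_inv
    return Theta_hat, draws.std(axis=0)
```

In this sketch, increasing `lam` shrinks `U` toward zero and lowers `df`, while `lam` near zero pushes `df` toward its maximum of $N \times Q$, the near-saturated regime that a df-based cap is meant to exclude.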

Paper Structure (107 sections, 50 equations, 4 figures, 9 tables, 1 algorithm)

Introduction
Model
Data structure and notation
Coding of covariates.
NMF with covariates
NMF with random effects (NMF-RE)
Unit-level interpretation.
Remark on signs and constraints.
Constraints, normalization, and identifiability
Estimation
Complexity control via the random-effects penalty
Working objective
Block-wise estimation scheme
$U$-step: ridge-type update
Computation.
...and 92 more sections

Figures (4)

Figure 1: Orthodont growth data ($Q=1$): observed measurements and fitted curves. Solid lines show fitted values from the fixed-effects component $\widehat{X}\widehat{\Theta} \boldsymbol a_n$, and dashed lines show BLUP-based fits $\widehat{X}(\widehat{\Theta} \boldsymbol a_n + \widehat{\boldsymbol u}_n)$. Points indicate observations (circles: female; crosses: male).
Figure 2: Holzinger--Swineford cognitive test data ($Q=3$): estimated structure linking observed tests ($Y$), latent components ($X$), and covariates ($A$) under the NMF-RE model.
Figure 3: OD--HUB data ($Q=4$): spatial visualization of the latent components identified by the NMF-RE model. Each destination prefecture is colored according to the dominant latent component based on the fitted latent scores.
Figure 4: Topic model with historical covariates ($Q=3$): estimated topic proportions based on the fixed-effects component $\widehat{X}\widehat{\Theta} A$. Each bar corresponds to one inaugural address (document), illustrating systematic shifts in topic prevalence across historical eras.
