CoMET: A Compressed Bayesian Mixed-Effects Model for High-Dimensional Tensors

Sreya Sarkar, Kshitij Khare, Sanvesh Srivastava

TL;DR

CoMET is a compressed Bayesian mixed-effects model for repeated-measures data with scalar responses and tensor-valued covariates. The authors identify regularity conditions under which its posterior predictive risk decays to zero, and the model outperforms penalized competitors across simulation studies and two benchmark applications: facial-expression prediction and music emotion modeling.

Abstract

Mixed-effects models are fundamental tools for analyzing clustered and repeated-measures data, but existing high-dimensional methods largely focus on penalized estimation with vector-valued covariates. Bayesian alternatives in this regime are limited, with no sampling-based mixed-effects framework that supports tensor-valued fixed- and random-effects covariates while remaining computationally tractable. We propose the Compressed Mixed-Effects Tensor (CoMET) model for high-dimensional repeated-measures data with scalar responses and tensor-valued covariates. CoMET performs structured, mode-wise random projection of the random-effects covariance, yielding a low-dimensional covariance parameter that admits simple Gaussian prior specification and enables efficient imputation of compressed random-effects. For the mean structure, CoMET leverages a low-rank tensor decomposition and margin-structured Horseshoe priors to enable fixed-effects selection. These design choices lead to an efficient collapsed Gibbs sampler whose computational complexity grows approximately linearly with the tensor covariate dimensions. We establish high-dimensional theoretical guarantees by identifying regularity conditions under which CoMET's posterior predictive risk decays to zero. Empirically, CoMET outperforms penalized competitors across a range of simulation studies and two benchmark applications involving facial-expression prediction and music emotion modeling.
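
To make the idea of structured, mode-wise random projection concrete, here is a minimal sketch for a matrix-valued (two-mode) covariate. It is not the authors' implementation: the helper names, the Gaussian projection entries with variance 1/k, and the toy dimensions are all assumptions chosen for illustration. The sketch compresses a $d_1 \times d_2$ covariate $Z$ to $k_1 \times k_2$ via $R_1 Z R_2^\top$ and checks the standard identity $\mathrm{vec}(R_1 Z R_2^\top) = (R_2 \otimes R_1)\,\mathrm{vec}(Z)$, which is why one small matrix per mode is enough and the full $(k_1 k_2) \times (d_1 d_2)$ projection never has to be formed.

```python
import random


def gaussian_matrix(rows, cols, rng):
    """Projection matrix with i.i.d. N(0, 1/rows) entries (an assumed choice)."""
    scale = rows ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain-Python product of two list-of-lists matrices."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product a (x) b as a list-of-lists matrix."""
    return [[a[i][j] * b[p][q]
             for j in range(len(a[0])) for q in range(len(b[0]))]
            for i in range(len(a)) for p in range(len(b))]


def vec(a):
    """Column-major vectorization of a matrix."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]


def modewise_compress(z, r1, r2):
    """Compress a d1 x d2 covariate to k1 x k2 by projecting each mode."""
    return matmul(matmul(r1, z), transpose(r2))


rng = random.Random(0)
d1, d2, k1, k2 = 12, 4, 3, 2  # hypothetical toy dimensions

z = [[rng.gauss(0.0, 1.0) for _ in range(d2)] for _ in range(d1)]
r1 = gaussian_matrix(k1, d1, rng)  # projection for mode 1
r2 = gaussian_matrix(k2, d2, rng)  # projection for mode 2

z_small = modewise_compress(z, r1, r2)

# Equivalent (but far larger) single projection acting on vec(Z).
full_projection = kron(r2, r1)  # shape (k1*k2) x (d1*d2)
z_vec = vec(z)
via_kron = [sum(w * v for w, v in zip(row, z_vec)) for row in full_projection]

max_gap = max(abs(x - y) for x, y in zip(vec(z_small), via_kron))
stored_modewise = k1 * d1 + k2 * d2   # entries in R1 and R2
stored_full = (k1 * k2) * (d1 * d2)   # entries in the Kronecker matrix
```

In this toy setting the two mode-wise matrices hold 44 entries versus 288 for the explicit Kronecker projection, and the gap widens quickly as the tensor dimensions or the number of modes grow. A covariance defined on the compressed $k_1 k_2$-dimensional space is correspondingly small.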

Paper Structure (20 sections, 7 theorems, 94 equations, 11 figures, 2 tables, 1 algorithm)

Key Result

Theorem 3.1

If assumptions (A1)-(A3) hold, $p^* = o(N)$, $k^{*2}\log k^* = o(p^*)$, $k^{*2}\log \log N = o(p^*)$, and $\|\bm{\beta}^0\|^2 = o(N)$, then the posterior predictive risk satisfies [display equation not recovered], where $\kappa(X)$ denotes the condition number of $X$.

Figures (11)

  • Figure 1: Comparison of RMSE and RMSPE, defined in \ref{eq:rmse_rmspe}. CoMET substantially outperforms existing penalized quasi-likelihood methods in estimation of $\mathcal{B}$ (RMSE) and out-of-sample prediction (RMSPE) across various choices of fixed-effect rank $K$ and cluster sizes $m \in \{3, 6, 9, 12\}$. Results are presented for compressed covariance dimension $k = 3$, summarized over 25 replications. CoMET, the compressed mixed-effects tensor model; oracle, the oracle benchmark; PQL-1, the penalized quasi-likelihood approach of Fan and Li (2012); PQL-2, the penalized quasi-likelihood approach of Li et al. (2021); GEE, the penalized generalized estimating equations method of Zhang et al. (2019).
  • Figure 2: Fixed-effects inference comparisons. The CoMET model produces narrower credible intervals for the entries of $\mathcal{B}$ than the PQL methods when $k = 3$ and $K \in \{4, 6, 8\}$, while achieving near-nominal coverage across all cluster sizes $m \in \{3, 6, 9, 12\}$. All metrics are summarized over 25 replications. CoMET, the compressed mixed-effects tensor model; oracle, the oracle variant of CoMET; PQL-1, the penalized quasi-likelihood approach of Fan and Li (2012); PQL-2, the penalized quasi-likelihood approach of Li et al. (2021). GEE, the penalized generalized estimating equations approach of Zhang et al. (2019), is excluded because it lacks a debiasing technique for constructing confidence intervals.
  • Figure 3: Posterior predictive interval widths of CoMET relative to those of the oracle. The 95% prediction intervals produced by CoMET are comparable in width to those of the oracle benchmark for all choices of cluster size ($m$), covariance compression dimension ($k$), and fixed-effect rank ($K$). For each replication, the relative width is the width of a CoMET prediction interval divided by the width of the corresponding oracle interval. Across 25 replications, the relative widths show minimal variability, with standard deviations on the order of $10^{-2}$.
  • Figure 4: Comparison of RMSPE, defined in \ref{eq:rmse_rmspe}, in real-data applications (left: DEAM data; right: LFW data). The CoMET model demonstrates competitive or superior predictive accuracy across all covariance compression dimensions ($k$) and fixed-effect ranks ($K$) compared to the penalized methods PQL-1 (Fan and Li, 2012), PQL-2 (Li et al., 2021), and GEE (Zhang et al., 2019). Results are summarized over 25 random splits of each dataset.
  • Figure 5: Estimated $12 \times 4$ matrix-valued fixed-effect coefficient $\mathcal{B}$ (transposed for visualization), averaged over 25 random training splits of the DEAM dataset. CoMET, the compressed mixed-effects model for tensors; PQL-1, the penalized quasi-likelihood method of Fan and Li (2012); PQL-2, the penalized quasi-likelihood approach of Li et al. (2021); GEE, the penalized generalized estimating equations approach of Zhang et al. (2019).
  • ...and 6 more figures
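
The two summary quantities used in the captions above can be sketched as follows. The relative-width computation follows the description in the Figure 3 caption (CoMET interval width divided by oracle interval width). The RMSPE function uses the conventional root mean squared prediction error over held-out responses, which is an assumption here: the paper's own definition lives in its equation `eq:rmse_rmspe`, which is not reproduced on this page. The function names and toy numbers are hypothetical.

```python
import math


def rmspe(y_true, y_pred):
    """Conventional root mean squared prediction error (assumed form)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_widths(model_intervals, oracle_intervals):
    """Width of each model interval divided by the matching oracle width."""
    return [(m_hi - m_lo) / (o_hi - o_lo)
            for (m_lo, m_hi), (o_lo, o_hi) in zip(model_intervals, oracle_intervals)]


# Hypothetical toy values, not results from the paper.
error = rmspe([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
ratios = relative_widths([(0.0, 2.0), (1.0, 4.0)], [(0.0, 2.0), (1.0, 3.0)])
```

A ratio near 1 means the model's prediction interval is about as wide as the oracle's, which is the pattern the Figure 3 caption reports.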

Theorems & Definitions (7)

  • Theorem 3.1
  • Theorem A.1
  • Lemma A.1
  • Lemma A.2
  • Lemma A.3
  • Lemma A.4
  • Lemma A.5