Mixed Effects Mixture of Experts: Modeling Double Heterogeneous Trajectories

Xinkai Yue, Xiaodong Yan, Haohui Han, Liya Fu

TL;DR

This work proposes a mixed effects mixture of experts model (MEMoE), a novel statistical framework built on a large model prototype. In simulation studies and an empirical application, MEMoE outperforms both the traditional single-population LMM and conventional Mixture of Experts models in parameter recovery, classification accuracy, and overall model fit.

Abstract

The linear mixed-effects model (LMM) is a cornerstone of longitudinal data analysis, but it is limited in its ability to handle heterogeneity that arises from both group-specific fixed effects and subject-specific random effects. To address this challenge, we propose a novel statistical framework built on a large model prototype: a mixed effects mixture of experts model (MEMoE). This framework integrates the divide-and-conquer paradigm of Mixture of Experts models with classical mixed-effects modeling. In the proposed MEMoE, each expert is a full LMM dedicated to capturing the longitudinal trajectory of a specific latent subpopulation, while a gating function learns to route subjects to the most appropriate expert in a data-driven manner based on baseline covariates. We develop a robust inferential procedure for parameter estimation based on the Laplace Expectation-Maximization algorithm, with standard errors calibrated using robust sandwich estimators to account for potential model misspecification. Extensive simulation studies and an empirical application demonstrate that MEMoE outperforms both the traditional single-population LMM and conventional Mixture of Experts models in terms of parameter recovery, classification accuracy, and overall model fit.
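To make the structure described in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes two experts, a softmax gate on a single baseline covariate, and a random-intercept LMM per expert; the function name `simulate_memoe` and every parameter value are hypothetical choices made for illustration only.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    expd = np.exp(shifted)
    return expd / expd.sum(axis=1, keepdims=True)


def simulate_memoe(n_subjects=200, n_visits=5, seed=0):
    """Simulate trajectories from a gated mixture of random-intercept LMMs.

    All parameter values below are illustrative assumptions, not values
    taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical gate coefficients: one (intercept, slope) row per expert.
    gate_coef = np.array([[0.0, 0.0], [0.0, 2.0]])
    # Hypothetical group-specific fixed effects: (intercept, time slope).
    beta = np.array([[1.0, 0.5], [-1.0, 2.0]])
    # Hypothetical random-intercept standard deviation for each expert.
    tau = np.array([0.5, 1.0])
    # Hypothetical residual standard deviation shared by both experts.
    sigma = 0.3
    n_experts = beta.shape[0]

    # Gate: baseline covariate -> probability of belonging to each expert.
    x = rng.normal(size=n_subjects)
    design = np.column_stack([np.ones(n_subjects), x])
    probs = softmax(design @ gate_coef.T)

    # Latent subpopulation label for each subject, drawn from the gate.
    z = np.array([rng.choice(n_experts, p=p) for p in probs])

    # Expert: LMM with fixed effects of the assigned group plus a
    # subject-specific random intercept.
    t = np.arange(n_visits, dtype=float)
    b = rng.normal(scale=tau[z])
    mean = beta[z, 0][:, None] + beta[z, 1][:, None] * t[None, :] + b[:, None]
    y = mean + rng.normal(scale=sigma, size=(n_subjects, n_visits))
    return x, t, y, z, probs
```

Calling `simulate_memoe()` returns the baseline covariate, the visit times, the response matrix (one row per subject), the latent labels, and the gate probabilities. Both sources of heterogeneity are visible in the simulated data: the latent labels select the fixed effects, and the random intercept shifts each subject within its group.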

Paper Structure

This paper contains 16 sections, 3 theorems, 180 equations, 5 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Let $\hat{\Psi}$ denote the estimator obtained by maximizing $\ell_{{\rm LA}}(\Psi)$ over the parameter space. Under the regularity conditions (A1)–(A5), $\hat{\Psi}$ is a consistent estimator of $\Psi$ as $N\to\infty$; that is: $\hat{\Psi} \xrightarrow{p} \Psi.$
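For reference, the convergence in probability asserted in Theorem 1 is the standard notion, written out below. The Euclidean norm on the parameter vector is an assumption here, since this summary does not state which norm the paper uses. For every $\varepsilon > 0$,

$$\lim_{N\to\infty} P\left( \lVert \hat{\Psi} - \Psi \rVert > \varepsilon \right) = 0.$$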

Figures (5)

  • Figure 1: Structure of the MEMoE. Pink blocks represent data inputs, yellow blocks denote the created model framework, and the green block signifies the model output.
  • Figure 2: Prediction mean square error (PMSE) under MEMoE, MoE, ReMoE and LMM for Examples 1 and 2. LMM is omitted from Example 1 because its PMSE is larger. All results imply that MEMoE attains oracle estimation performance.
  • Figure 3: Prediction mean square error under MEMoE, MoE, ReMoE and LMM for Cases 1-3 of Example 3, shown in (a), (c) and (e); MEMoE shows the best prediction performance. Panels (b), (d) and (f) show that MEMoE is also robust to the random effect variance.
  • Figure 4: The histogram of residuals from the LMM.
  • Figure 5: MEMoE 95% prediction sets for PBCseq bilirubin. The red bars are the prediction sets for each observation. The dark and light blue points are the $\log(\text{bilirubin})$ values covered and missed by the prediction sets, respectively.

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • proof
  • proof
  • Lemma 1
  • proof
  • proof