Using latent representations to link disjoint longitudinal data for mixed-effects regression

Clemens Schächter, Maren Hackenberg, Michelle Pfaffenlehner, Félix B. Tambe-Ndonfack, Thorsten Schmidt, Astrid Pechmann, Janbernd Kirschner, Jan Hasenauer, Harald Binder

TL;DR

This work tackles the challenge of analyzing treatment switches in rare diseases when observational data are disjoint due to changing measurement instruments. It introduces a framework that maps multi-instrument observations into a shared latent space via variational autoencoders and then applies a latent multivariate mixed-effects model to capture disease dynamics and switch effects. A bootstrap-knockoff inference procedure enables valid statistical testing despite joint optimization of the latent and mixed-model components, demonstrated on spinal muscular atrophy data with five measurement instruments. The approach yields detectable treatment-switch effects, improves upon naive meta-analysis by leveraging cross-instrument information and larger effective samples, and highlights a promising direction for combining deep learning with classical statistics in small, multi-modal datasets.

Abstract

Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. To analyze the impact of such treatment switches within the low sample size limitations of rare disease trials, it is important to use all available data sources. This, however, is complicated when the measurement instruments in use change during the observation period, for example when instruments are adapted to specific age ranges. The resulting disjoint longitudinal data trajectories complicate the application of traditional modeling approaches like mixed-effects regression. We tackle this by mapping observations of each instrument to an aligned low-dimensional temporal trajectory, enabling longitudinal modeling across instruments. Specifically, we employ a set of variational autoencoder architectures to embed item values into a shared latent space for each time point. Temporal disease dynamics and treatment switch effects are then captured through a mixed-effects regression model applied to latent representations. To enable statistical inference, we present a novel statistical testing approach that accounts for the joint parameter estimation of mixed-effects regression and variational autoencoders. The methodology is applied to quantify the impact of treatment switches for patients with spinal muscular atrophy. Here, our approach aligns motor performance items from different measurement instruments for mixed-effects regression and maps estimated effects back to the observed item level to quantify the treatment switch effect. Our approach allows for model selection as well as for assessing effects of treatment switching. The results highlight the potential of modeling in joint latent representations for addressing small data challenges.

Paper Structure

This paper contains 14 sections, 17 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Schematic illustration of the proposed approach for multiple measurement instruments: 1) The items of each measurement instrument are encoded by a dedicated encoder network and corresponding latent values are drawn. 2) The latent values are averaged across instruments at each time step to obtain a joint latent trajectory. 3) The averaged latent values serve as the outcome variable for a multivariate mixed-effects regression, which provides best linear unbiased estimators (BLUE) and best linear unbiased predictors (BLUP). 4) Predictions from the mixed-effects model serve as input to a dedicated decoder network for each measurement instrument, for reconstruction at the original item level. 5) Treatment switch effects can be quantified per item. 6) A likelihood-ratio test provides statistical inference.
  • Figure 2: Comparison of total HINE-2 sum scores for five patient trajectories. Observed data trajectories are displayed in black, the respective mixed model prediction with treatment switch in red, and mixed model predictions for a hypothetical trajectory without treatment switch in blue. The reconstructed trajectories of the latent mixed model are displayed in the upper subplot and for a standard data-level mixed model in the lower subplot.
  • Figure 3: Empirical cumulative distribution functions of the null distributions of the likelihood ratio test statistic from artificially added knockoff variables for fixed (left column) and random effects (right column) in the mixed-effects regression in latent representations of dimension $d=1$ (top row) and $d=3$ (bottom row). The red line represents the theoretical chi-squared distribution that does not take into account the interdependent VAE and mixed model training procedure.
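
Two steps from the figure captions can be illustrated with a minimal Python sketch: step 2 of Figure 1 (averaging latent values across instruments at each time step) and the comparison in Figure 3 between an empirical null distribution and the theoretical chi-squared reference. This is not the authors' implementation. The function names, the use of `None` for time steps at which an instrument was not recorded, the restriction to a scalar latent ($d=1$), the one-degree-of-freedom chi-squared reference, and the add-one correction in the empirical p-value are all assumptions made for illustration.

```python
import math
from typing import List, Optional, Sequence


def average_latents(
    per_instrument: Sequence[Sequence[Optional[float]]],
) -> List[Optional[float]]:
    """Average scalar latent values across instruments at each time step.

    per_instrument[k][t] is the latent value drawn for instrument k at time
    step t, or None if instrument k was not recorded then (assumed encoding).
    """
    n_steps = len(per_instrument[0])
    joint: List[Optional[float]] = []
    for t in range(n_steps):
        observed = [z[t] for z in per_instrument if z[t] is not None]
        # Time steps with no instrument at all stay missing.
        joint.append(sum(observed) / len(observed) if observed else None)
    return joint


def empirical_p_value(observed_stat: float, null_stats: Sequence[float]) -> float:
    """P-value from an empirical null sample of likelihood-ratio statistics.

    Uses the common add-one correction so the p-value is never exactly zero
    (an assumption, not taken from the paper).
    """
    exceed = sum(1 for s in null_stats if s >= observed_stat)
    return (exceed + 1) / (len(null_stats) + 1)


def chi2_p_value_df1(stat: float) -> float:
    """Upper tail of a chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical latent values for two instruments over four time steps.
latents = [
    [0.2, 0.4, None, None],   # instrument A, used early
    [None, 0.6, 0.8, None],   # instrument B, used later
]
joint_trajectory = average_latents(latents)

# Hypothetical null statistics, e.g. from refitting with knockoff variables.
null_stats = [0.5, 1.0, 2.0, 4.5, 6.0]
p_empirical = empirical_p_value(4.0, null_stats)
p_theoretical = chi2_p_value_df1(4.0)
```

In this toy setting the two p-values differ, which mirrors the point of Figure 3: when the null distribution obtained under joint VAE and mixed-model training deviates from the theoretical chi-squared curve, inference based on the latter can be miscalibrated.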