Using latent representations to link disjoint longitudinal data for mixed-effects regression
Clemens Schächter, Maren Hackenberg, Michelle Pfaffenlehner, Félix B. Tambe-Ndonfack, Thorsten Schmidt, Astrid Pechmann, Janbernd Kirschner, Jan Hasenauer, Harald Binder
TL;DR
This work tackles the challenge of analyzing treatment switches in rare diseases when observational data are disjoint due to changing measurement instruments. It introduces a framework that maps multi-instrument observations into a shared latent space via variational autoencoders and then applies a latent multivariate mixed-effects model to capture disease dynamics and switch effects. A bootstrap-knockoff inference procedure enables valid statistical testing despite joint optimization of the latent and mixed-model components, demonstrated on spinal muscular atrophy data with five measurement instruments. The approach yields detectable treatment-switch effects, improves upon naive meta-analysis by leveraging cross-instrument information and larger effective samples, and highlights a promising direction for combining deep learning with classical statistics in small, multi-modal datasets.
Abstract
Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. Analyzing the impact of such treatment switches under the small sample sizes typical of rare disease trials requires using all available data sources. This, however, is complicated when the measurement instruments in use change during the observation period, for example when instruments are adapted to specific age ranges. The resulting disjoint longitudinal data trajectories complicate the application of traditional modeling approaches like mixed-effects regression. We tackle this by mapping observations of each instrument to an aligned low-dimensional temporal trajectory, enabling longitudinal modeling across instruments. Specifically, we employ a set of variational autoencoder architectures to embed item values into a shared latent space for each time point. Temporal disease dynamics and treatment switch effects are then captured through a mixed-effects regression model applied to the latent representations. To enable statistical inference, we present a novel statistical testing approach that accounts for the joint parameter estimation of mixed-effects regression and variational autoencoders. The methodology is applied to quantify the impact of treatment switches for patients with spinal muscular atrophy. Here, our approach aligns motor performance items from different measurement instruments for mixed-effects regression and maps estimated effects back to the observed item level to quantify the treatment switch effect. Our approach allows for model selection as well as for assessing effects of treatment switching. The results highlight the potential of modeling in joint latent representations for addressing small data challenges.
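The data-linking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: fixed linear maps stand in for the trained variational autoencoders, the instrument names "A" and "B", the weights, and the helper names `Visit`, `encode`, and `to_latent_rows` are all hypothetical, and the mixed-effects fit itself is omitted. The sketch only shows how visits recorded with instruments of different item counts could end up in one long-format table over a shared latent space, with the time and switch covariates a mixed-effects regression would use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Visit:
    patient: str
    time: float         # time since baseline (units are an assumption)
    instrument: str     # instrument used at this visit
    items: List[float]  # item scores; length depends on the instrument
    switched: bool      # True once the patient has switched treatment


# Hypothetical stand-ins for the per-instrument encoders: one weight matrix
# per instrument, each mapping that instrument's items to 2 latent dimensions.
ENCODERS: Dict[str, List[List[float]]] = {
    "A": [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]],                 # 3 items
    "B": [[0.25, 0.25, 0.25, 0.25], [0.0, 0.0, 0.5, 0.5]],   # 4 items
}


def encode(visit: Visit) -> List[float]:
    """Map one visit's item scores into the shared latent space."""
    weights = ENCODERS[visit.instrument]
    if len(weights[0]) != len(visit.items):
        raise ValueError("item count does not match the instrument's encoder")
    return [sum(w * x for w, x in zip(row, visit.items)) for row in weights]


def to_latent_rows(visits: List[Visit]) -> List[dict]:
    """Build one long-format row per visit, regardless of instrument."""
    rows = []
    for v in sorted(visits, key=lambda v: (v.patient, v.time)):
        rows.append({
            "patient": v.patient,          # grouping factor for random effects
            "time": v.time,                # fixed-effect covariate
            "switched": int(v.switched),   # treatment-switch indicator
            "z": encode(v),                # shared latent representation
        })
    return rows


visits = [
    Visit("p1", 1.0, "B", [1.0, 2.0, 3.0, 2.0], True),
    Visit("p1", 0.0, "A", [2.0, 4.0, 1.0], False),
]
rows = to_latent_rows(visits)
```

In this toy setup the patient changes from instrument A to instrument B between visits, yet both rows carry a latent vector of the same dimension, so a single longitudinal model can be fitted across the instrument change. In the approach described above, the encoders and the regression parameters are estimated jointly rather than fixed in advance as they are here.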
