Amortized In-Context Mixed Effect Transformer Models: A Zero-Shot Approach for Pharmacokinetics
César Ali Ojeda Marin, Wilhelm Huisinga, Purity Kavwele, Ramsés J. Sánchez, Niklas Hartung
TL;DR
This work tackles the challenge of sparse, longitudinal pharmacokinetic data by introducing AICMET, a transformer-based latent-variable framework that blends mechanistic compartmental priors with amortized in-context Bayesian inference. By pretraining on large synthetic PK trajectories with Ornstein–Uhlenbeck priors, AICMET achieves zero-shot adaptation to new compounds and provides calibrated, patient-specific predictions after only a few early measurements. The model combines global population codes with individual-specific latent factors and uses a time-aware transformer decoder to handle irregular sampling and dosing information, delivering both accurate forecasts and meaningful uncertainty quantification. Empirical results on PK-DB demonstrate state-of-the-art accuracy and robust inter-patient variability capture, highlighting the potential of population-aware, mechanistically grounded neural architectures for truly personalized pharmacotherapy.
Abstract
Accurate dose-response forecasting under sparse sampling is central to precision pharmacotherapy. We present the Amortized In-Context Mixed-Effect Transformer (AICMET) model, a transformer-based latent-variable framework that unifies mechanistic compartmental priors with amortized in-context Bayesian inference. AICMET is pre-trained on hundreds of thousands of synthetic pharmacokinetic trajectories with Ornstein-Uhlenbeck priors over the parameters of compartment models, endowing the model with strong inductive biases and enabling zero-shot adaptation to new compounds. At inference time, the decoder conditions on the collective context of previously profiled trial participants, generating calibrated posterior predictions for newly enrolled patients after a few early drug concentration measurements. This capability collapses traditional model-development cycles from weeks to hours while retaining a role for expert modeling. Experiments across public datasets show that AICMET attains state-of-the-art predictive accuracy and faithfully quantifies inter-patient variability, outperforming both nonlinear mixed-effects baselines and recent neural ODE variants. Our results highlight the feasibility of transformer-based, population-aware neural architectures as a new alternative to bespoke pharmacokinetic modeling pipelines, charting a path toward personalized dosing regimens.
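
To make the idea of synthetic pre-training data concrete, the following is a minimal Python sketch of how one sparse, irregularly sampled pharmacokinetic trajectory could be simulated. It is not the authors' simulator. The one-compartment oral-absorption structure, the choice to let the log elimination rate follow an Ornstein-Uhlenbeck process in time, all numerical values, and the function names simulate_ou and simulate_patient are assumptions made purely for illustration.

```python
import numpy as np


def simulate_ou(rng, times, mean, theta, sigma):
    """Sample an Ornstein-Uhlenbeck path at the given times.

    Uses the exact Gaussian transition of the OU process, so the
    time grid may be irregular. The path starts at its mean.
    """
    x = np.empty(len(times))
    x[0] = mean
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        decay = np.exp(-theta * dt)
        std = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
        x[i] = mean + (x[i - 1] - mean) * decay + std * rng.normal()
    return x


def simulate_patient(rng, dose=100.0, t_end=24.0, n_grid=2401, n_obs=5):
    """Simulate one patient and return sparse, noisy observations.

    Assumed model (illustrative only): one compartment with
    first-order absorption, integrated with forward Euler.
    """
    grid = np.linspace(0.0, t_end, n_grid)
    dt = grid[1] - grid[0]

    # Individual-level parameters: log-normal scatter around
    # hypothetical population values (the "mixed effects").
    ka = np.exp(rng.normal(np.log(1.0), 0.2))    # absorption rate, 1/h
    vol = np.exp(rng.normal(np.log(20.0), 0.2))  # distribution volume, L

    # Assumption: the log elimination rate fluctuates as an OU process.
    ke = np.exp(simulate_ou(rng, grid, mean=np.log(0.2), theta=0.5, sigma=0.1))

    gut = np.empty(n_grid)   # drug amount at the absorption site
    conc = np.empty(n_grid)  # plasma concentration
    gut[0], conc[0] = dose, 0.0
    for i in range(1, n_grid):
        gut[i] = gut[i - 1] - dt * ka * gut[i - 1]
        conc[i] = conc[i - 1] + dt * (ka * gut[i - 1] / vol - ke[i - 1] * conc[i - 1])

    # Sparse, irregular sampling times with multiplicative log-normal noise.
    obs_idx = np.sort(rng.choice(np.arange(1, n_grid), size=n_obs, replace=False))
    noise = np.exp(rng.normal(0.0, 0.1, size=n_obs))
    return grid[obs_idx], conc[obs_idx] * noise


rng = np.random.default_rng(0)
# A small synthetic "study": each patient has its own sampling times.
population = [simulate_patient(rng) for _ in range(8)]
```

Repeating such draws over many parameter settings would produce a corpus of (time, concentration) sets on which an amortized model could be trained. How AICMET actually parameterizes its priors and compartment models is specified in the paper itself, not in this sketch.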
