Time-to-Event Modeling with Pseudo-Observations in Federated Settings

Hyojung Jang, Malcolm Risk, Yaojie Wang, Norrina Bai Allen, Xu Shi, Lili Zhao

TL;DR

A covariate-wise debiasing procedure is introduced that shrinks noise-driven local deviations toward the global estimate while preserving genuine site-specific effects, balancing global stability against site-level fidelity in the bias-variance trade-off.

Abstract

In multi-center clinical research, privacy regulations often prohibit pooling individual-level records, complicating the analysis of time-to-event data. Current federated survival methods frequently require iterative communication, rely strictly on proportional hazards (PH) assumptions, or require sensitive survival information. We propose a one-shot federated framework using pseudo-observations derived from a sequentially updated Kaplan-Meier estimator and fitted via a renewable generalized estimating equation. Unlike traditional methods, our approach allows flexible link functions tailored to the target estimand and accommodates non-proportional hazards. To address site-level heterogeneity, we introduce a covariate-wise debiasing procedure that shrinks noise-driven local deviations toward the global estimate while preserving genuine site-specific effects. Simulation studies demonstrate that, under PH assumptions, our framework achieves inferential accuracy comparable to pooled Cox regression and to ODAC (the privacy-preserving One-shot Distributed Algorithm to fit a multicenter Cox proportional hazards model), while recovering time-varying coefficient trajectories when PH is violated. Furthermore, simulations confirm that the debiasing procedure optimizes the bias-variance trade-off, adaptively balancing global stability with the preservation of genuine site-specific deviations. Applied to pediatric obesity data from the Chicago Area Patient-Centered Outcomes Research Network (CAPriCORN) ($N = 45{,}865$), the model produced robust estimates of time-invariant and time-varying hazard ratios, offering a flexible, privacy-preserving alternative for collaborative survival research.
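The pseudo-observations mentioned in the abstract can be illustrated with the standard leave-one-out (jackknife) construction on a single site. The sketch below is only an illustration in Python with NumPy: the function names `km_survival` and `pseudo_observations` are hypothetical, and it does not reproduce the sequentially updated Kaplan-Meier estimator or the renewable generalized estimating equation fit that the paper proposes.

```python
import numpy as np


def km_survival(time, event, t0):
    """Kaplan-Meier estimate of the survival probability S(t0)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    # Multiply conditional survival over distinct event times up to t0.
    for t in np.unique(time[event & (time <= t0)]):
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(time, event, t0):
    """Leave-one-out pseudo-observations for S(t0).

    Subject i receives n * S_hat(t0) - (n - 1) * S_hat_without_i(t0).
    This is the textbook single-site construction (an assumption here),
    not the paper's sequentially updated variant.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t0)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # drop subject i
        loo = km_survival(time[keep], event[keep], t0)
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True   # restore subject i
    return pseudo


# Toy data (made up for illustration): follow-up times and event indicators.
toy_time = [2.0, 3.5, 4.0, 5.5, 7.0, 8.0]
toy_event = [1, 0, 1, 1, 0, 1]
toy_pseudo = pseudo_observations(toy_time, toy_event, t0=5.0)
```

With no censoring, the Kaplan-Meier estimator reduces to the empirical survival function, so each pseudo-observation collapses to the indicator that the subject survived past `t0`, which makes a convenient sanity check. The resulting values can then serve as responses in a regression with a link function chosen for the target estimand.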

Paper Structure

This paper contains 12 sections, 12 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Boxplots of bias in the estimated treatment log-HR across 500 simulation replicates under the PH setting ($N = 7{,}500$). The left and right panels correspond to 10% and 30% event rates, respectively. The x-axis denotes the site configuration (e.g., "$1000 \times 5$" indicates 5 sites with 1,000 subjects per site). The pooled Cox model serves as the reference.
  • Figure 2: Estimation of time-varying treatment effects under non-PH settings. The orange curve represents the true time-varying log-HR induced by the Weibull data-generating mechanism. Green boxplots show the distribution of estimates from the federated model at landmark times corresponding to the 20th to 80th percentiles of the survival distribution ($N = 5{,}500$, $K = 20$).
  • Figure 3: Simulation-average RMSE of site-specific treatment effect estimates under sparse site heterogeneity ($K=50, n_k=100$). The horizontal axis shows the outlier magnitude $\tau$, and panels correspond to increasing proportions of heterogeneous sites $\pi$. We compare the global federated estimator (red line), the local site-wise estimator (black line), and the proposed debiased estimator with SURE-selected soft-thresholding (purple dashed line). Each point represents an average over 500 simulation replicates.
  • Figure 4: Forest plots of point estimates with 95% confidence intervals for each baseline covariate (excluding age and BMI percentile) from the pooled (red) and federated (blue) pseudo-observation regressions. Panels are faceted by covariate and illustrate that the federated algorithm recovers the same constant log-hazard ratios as the pooled analysis. Below, time-varying coefficient trajectories for baseline age and BMI percentile are shown over a grid of prespecified landmark times; each panel overlays federated (blue) and pooled (red) log-hazard ratio estimates, demonstrating near-identical temporal patterns.
  • Figure 5: Comparison of local (blue) and debiased (red) beta estimates across four sites for two covariates: comorbidity (top) and age (bottom). The dashed lines indicate the federated global estimates, which serve as the baseline reference for shrinkage. In the top panel, the vertical movement from a blue dot to a red dot visually represents the degree of shrinkage applied to the time-invariant comorbidity effect at each site. In the bottom panel, the trajectories plotted against follow-up time illustrate how the local time-varying effects of age are adjusted at each prespecified landmark time toward the global trajectory.
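
The captions for Figures 3 and 5 describe shrinking local estimates toward the global estimate with SURE-selected soft-thresholding. One common way to realize that idea, sketched here under assumptions rather than taken from the paper, is to standardize each site's deviation from the global estimate, pick the threshold that minimizes Stein's unbiased risk estimate for unit-variance soft-thresholding, and map the thresholded deviations back to the coefficient scale. The names `soft_threshold`, `sure_soft`, and `debias_local` are hypothetical, and the sketch treats the global estimate as fixed and the standardized deviations as independent with unit variance, which the paper's procedure may not do.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam; entries within lam become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sure_soft(z, lam):
    """Stein's unbiased risk estimate for soft-thresholding unit-variance z."""
    z = np.asarray(z, dtype=float)
    return (len(z)
            - 2.0 * np.sum(np.abs(z) <= lam)
            + np.sum(np.minimum(z ** 2, lam ** 2)))


def debias_local(beta_local, se_local, beta_global):
    """Shrink site-level estimates of one covariate toward the global estimate.

    Assumptions (not from the paper): deviations are standardized by the
    local standard errors and the global estimate is treated as known.
    """
    beta_local = np.asarray(beta_local, dtype=float)
    se_local = np.asarray(se_local, dtype=float)
    z = (beta_local - beta_global) / se_local
    # The SURE minimizer lies at 0 or at one of the observed |z| values.
    candidates = np.concatenate(([0.0], np.abs(z)))
    risks = [sure_soft(z, lam) for lam in candidates]
    lam = candidates[int(np.argmin(risks))]
    return beta_global + se_local * soft_threshold(z, lam), lam


# Toy example (made-up numbers): four sites near the global value, one outlier.
toy_debiased, toy_lam = debias_local(
    beta_local=[0.0, 0.2, -0.2, 0.1, 10.0],
    se_local=[1.0, 1.0, 1.0, 1.0, 1.0],
    beta_global=0.0,
)
```

In this toy setting the small deviations are set to the global value while the large one is only slightly shrunk, mirroring the behavior the captions describe: noise-level differences are removed and genuine site-specific effects are largely kept. Applying the function separately to each covariate gives a covariate-wise version.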