A Multiparty Homomorphic Encryption Approach to Confidential Federated Kaplan Meier Survival Analysis

Narasimha Raghavan Veeraragavan; Svetlana Boudko; Jan Franz Nygård

TL;DR

This work tackles privacy-preserving federated Kaplan–Meier survival analysis by introducing a multiparty threshold CKKS homomorphic encryption framework with native floating-point support. It provides a formal utility-loss and convergence analysis, along with explicit reconstruction-attack mitigation, and validates the approach on NCCTG Lung Cancer and synthetic Breast Cancer datasets. The results show that encrypted federated KM estimates closely match centralized or non-encrypted federated results, preserving statistical validity (e.g., log-rank tests) while enabling secure aggregation across up to 50 sites, albeit with an 8–19× runtime overhead. The study demonstrates that reconstruction attacks are most potent in small federations with high data overlap but are significantly weakened as federation size grows or overlaps decrease, highlighting the practical privacy benefits of threshold HE in multi-institutional survival analysis. Overall, the framework advances secure, high-fidelity federated survival analysis with formal guarantees and practical applicability for moderate-scale collaborations.

Abstract

The proliferation of healthcare data has expanded opportunities for collaborative research, yet stringent privacy regulations hinder pooling sensitive patient records. We propose a \emph{multiparty homomorphic encryption-based} framework for \emph{privacy-preserving federated Kaplan--Meier survival analysis}, offering native floating-point support, a theoretical model, and explicit reconstruction-attack mitigation. Compared to prior work, our framework ensures encrypted federated survival estimates closely match centralized outcomes, supported by formal utility-loss bounds that demonstrate convergence as aggregation and decryption noise diminish. Extensive experiments on the NCCTG Lung Cancer and synthetic Breast Cancer datasets confirm low \emph{mean absolute error (MAE)} and \emph{root mean squared error (RMSE)}, indicating negligible deviations between encrypted and non-encrypted survival curves. Log-rank and numerical accuracy tests reveal \emph{no significant difference} between federated encrypted and non-encrypted analyses, preserving statistical validity. A reconstruction-attack evaluation shows that smaller federations (2--3 providers) with overlapping data between institutions are vulnerable, a challenge mitigated by multiparty encryption. Larger federations (5--50 sites) degrade reconstruction accuracy further, with encryption improving confidentiality. Despite an 8--19$\times$ computational overhead, threshold-based homomorphic encryption is \emph{feasible for moderate-scale deployments}, balancing security and runtime. By providing robust privacy guarantees alongside high-fidelity survival estimates, our framework advances the state of the art in secure multi-institutional survival analysis.
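To make the aggregation idea concrete, the following is a minimal plaintext sketch, not the paper's implementation. It assumes each site reports per-time event counts $d_t$ and at-risk counts $n_{\text{at\_risk}}$ on a shared time grid, and that the aggregator only needs their element-wise sums. In the encrypted setting those sums would be computed homomorphically under threshold CKKS; here ordinary addition stands in for that step. The function names, the toy records, and the shared grid are all illustrative assumptions.

```python
import math

def site_counts(records, grid):
    """Per-site event counts d_t and at-risk counts n_t on a shared time grid.

    records: list of (time, event) pairs; event is 1 for an observed event
    and 0 for censoring. The shared grid is an assumption of this sketch.
    """
    d = [sum(1 for t, e in records if t == g and e == 1) for g in grid]
    n = [sum(1 for t, _ in records if t >= g) for g in grid]
    return d, n

def aggregate(per_site_vectors):
    """Element-wise sum across sites (plaintext stand-in for homomorphic addition)."""
    return [sum(vals) for vals in zip(*per_site_vectors)]

def km_from_counts(d, n):
    """Kaplan-Meier product-limit estimate from event and at-risk counts."""
    s, curve = 1.0, []
    for d_t, n_t in zip(d, n):
        if n_t > 0:
            s *= 1.0 - d_t / n_t
        curve.append(s)
    return curve

def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical toy data: three sites, each with (time, event) records.
sites = [
    [(2, 1), (3, 0), (5, 1)],
    [(2, 1), (4, 1), (5, 0)],
    [(1, 1), (3, 1), (6, 0)],
]
grid = sorted({t for site in sites for t, _ in site})

# Federated path: each site computes local counts, only the sums are combined.
counts = [site_counts(site, grid) for site in sites]
d_fed = aggregate([d for d, _ in counts])
n_fed = aggregate([n for _, n in counts])
s_fed = km_from_counts(d_fed, n_fed)

# Centralized path: pool all records, then estimate.
pooled = [r for site in sites for r in site]
s_cen = km_from_counts(*site_counts(pooled, grid))

print("MAE:", mae(s_fed, s_cen), "RMSE:", rmse(s_fed, s_cen))
```

Because the plaintext sums are exact, the federated and centralized curves coincide in this sketch. With CKKS, the aggregated counts would carry small approximation noise, which is the deviation the MAE and RMSE metrics are meant to quantify.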

Paper Structure (55 sections, 4 theorems, 11 equations, 7 figures, 4 tables, 5 algorithms)

Introduction
Summary of Contributions
Related Work
Background
Homomorphic Encryption
Survival Analysis: Kaplan Meier Estimation
Survival Function and Time-to-Event Analysis
Key Concepts in Kaplan-Meier Estimation
Methodology of the Kaplan-Meier Estimator
Interpreting the Kaplan-Meier Curve
Applications of Kaplan-Meier Estimation
Problem Formulation
Challenges in Federated Kaplan--Meier Estimation
Proposed Solution
Assumptions, Definitions, and Notations
...and 40 more sections

Key Result

Theorem 1

Let $\hat{S}_{\text{centralized}}(t)$ and $\hat{S}_{\text{federated}}(t)$ denote the Kaplan-Meier survival probabilities estimated at time $t$ in centralized and federated homomorphic encrypted (HE) settings, respectively. Assume: Under these assumptions, the utility loss $\Delta S(t)$ and error growth can be analyzed as follows: Utility Loss Bound: where $M_k \propto D_k$ reflects the relationsh

Figures (7)

Figure 1: Centralized Survival Curves for the Lung and Breast Cancer datasets
Figure 2: Federated and Federated Encrypted Survival Curves for the Lung Cancer dataset at Varying Client Counts
Figure 3: Federated and Federated Encrypted Survival Curves for the Breast Cancer dataset at Varying Client Counts
Figure 4: Numerical Accuracy Results for the Lung and Breast Cancer datasets
Figure 5: NCCTG Lung Cancer Dataset: Reconstruction metrics comparing None, Small, and Large Overlap for $d_t$ and $n_{\text{at\_risk}}$. The x-axis is the number of providers, and each curve shows how an attacker's reconstruction quality changes with the federation size.
...and 2 more figures

Theorems & Definitions (8)

Theorem 1: Utility Loss Bound in Federated Homomorphic Encrypted Kaplan-Meier Estimators for Worst Case Scenario
proof
Theorem 2: Utility Loss Bound for Bounded At-Risk Counts with Noise Terms
proof
Theorem 3: Convergence of Federated Kaplan-Meier Estimators
proof
Theorem 4: Scalability of Noise with Client Count
proof
