Comparing Federated Stochastic Gradient Descent and Federated Averaging for Predicting Hospital Length of Stay

Mehmet Yigit Balik

TL;DR

This work predicts hospital length of stay (LOS) under data privacy constraints by modeling hospitals as nodes in an empirical graph and training local linear predictors via generalized total variation minimization (GTVMin). It compares FedSGD and two FedAVG variants for federated optimization on decentralized hospital data, using a graph-regularized objective that links similar local models. FedSGD achieves the lowest MSE on the training, validation, and test sets (approximately 1.354–1.407 across the three sets), outperforming FedAVGv1 and FedAVGv2, likely because of data heterogeneity across facilities. The study demonstrates that privacy-preserving federated learning with graph-based regularization can yield accurate LOS predictions without sharing sensitive patient data. It suggests future work on hyperparameter optimization, alternative discrepancy measures, and broader decentralization.

Abstract

Reliable prediction of hospital length of stay (LOS) is essential for efficient resource allocation at hospitals. Traditional predictive modeling tools frequently have difficulty acquiring sufficient and diverse data because healthcare institutions have privacy rules in place. In our study, we model this problem as an empirical graph whose nodes are the hospitals. This approach facilitates collaborative model training on decentralized data from different hospitals without moving sensitive data outside of them. A local model is trained at each node (hospital) with the aim of generalized total variation minimization (GTVMin). Moreover, we implemented and compared two federated learning optimization algorithms: federated stochastic gradient descent (FedSGD) and federated averaging (FedAVG). Our results show that federated learning enables accurate prediction of hospital LOS while addressing privacy concerns, since no data leaves the healthcare institutions.
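
To make the graph-regularized idea concrete, here is a minimal sketch of a GTVMin-style objective for local linear models, together with one synchronous gradient step on it. It is an illustration under stated assumptions, not the paper's implementation: the squared Euclidean distance as discrepancy measure, the mean squared error as local loss, the chain graph, the values of `alpha` and `lr`, the synthetic data, and the function names `gtvmin_objective` and `gradient_step` are all chosen here for the example.

```python
import numpy as np

def gtvmin_objective(W, X, y, A, alpha):
    """Sum of local mean squared errors plus alpha times the weighted squared
    differences between the weight vectors of connected nodes (assumed form)."""
    local = sum(np.mean((X[i] @ W[i] - y[i]) ** 2) for i in range(len(W)))
    diffs = W[:, None, :] - W[None, :, :]               # (n, n, d) pairwise differences
    tv = 0.5 * np.sum(A * np.sum(diffs ** 2, axis=2))   # 0.5: count each undirected edge once
    return local + alpha * tv

def gradient_step(W, X, y, A, alpha, lr):
    """One synchronous gradient step for every node. Each node uses only its
    own data and the current weight vectors of its neighbors."""
    W_new = np.empty_like(W)
    for i in range(len(W)):
        residual = X[i] @ W[i] - y[i]
        grad_loss = 2.0 * X[i].T @ residual / len(y[i])         # gradient of the local MSE
        grad_tv = 2.0 * alpha * (A[i].sum() * W[i] - A[i] @ W)  # gradient of the coupling term
        W_new[i] = W[i] - lr * (grad_loss + grad_tv)
    return W_new

# Synthetic stand-in data (NOT the paper's hospital data): 3 nodes on a chain graph.
rng = np.random.default_rng(0)
n_nodes, d, m = 3, 3, 20
X = [rng.normal(size=(m, d)) for _ in range(n_nodes)]
w_true = rng.normal(size=d)
y = [X[i] @ w_true + 0.1 * rng.normal(size=m) for i in range(n_nodes)]
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # symmetric edge weights

W = np.zeros((n_nodes, d))
before = gtvmin_objective(W, X, y, A, alpha=0.5)
for _ in range(200):
    W = gradient_step(W, X, y, A, alpha=0.5, lr=0.05)
after = gtvmin_objective(W, X, y, A, alpha=0.5)
print(f"objective before: {before:.3f}, after: {after:.3f}")
```

In this sketch only weight vectors cross node boundaries; the feature matrices `X[i]` and labels `y[i]` stay local, which is the privacy property the abstract describes. A larger `alpha` pulls the weights of connected nodes closer together, while `alpha = 0` reduces to independent local training.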

Paper Structure

This paper contains 16 sections, 14 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Assumed Empirical Graph in FedAVG
  • Figure 2: Constructed Empirical Graph Used in FedSGD