Communication-Efficient Hybrid Federated Learning for E-health with Horizontal and Vertical Data Partitioning
Chong Yu, Shuaiqi Shen, Shiqiang Wang, Kuan Zhang, Hai Zhao
TL;DR
This work tackles the challenge of learning from e-health data partitioned across horizontal and vertical axes by proposing a three-tier Hybrid Federated Learning framework that combines intermediate result exchange with local and global aggregations. The authors introduce Hybrid SGD (HSGD), provide convergence guarantees under standard smoothness and variance assumptions, and derive adaptive strategies to balance convergence quality with communication cost. Empirical results across multi-domain datasets demonstrate substantial reductions in training time and communication overhead while maintaining high accuracy, validating the effectiveness of the adaptive interval tuning and learning-rate adjustments. The framework offers a practical, privacy-preserving approach for collaborative medical AI among wearables, hospitals, and cloud infrastructure, with potential for secure integration and broader adoption in health IT.
Abstract
E-health allows smart devices and medical institutions to collaboratively collect patients' data, which is then used to train Artificial Intelligence (AI) models that help doctors make diagnoses. By allowing multiple devices to train models collaboratively, federated learning is a promising solution to the communication and privacy issues in e-health. However, applying federated learning in e-health faces many challenges. First, medical data is both horizontally and vertically partitioned. Since Horizontal Federated Learning (HFL) or Vertical Federated Learning (VFL) techniques alone cannot deal with both types of data partitioning, directly applying them may incur excessive communication cost, because part of the raw data must be transmitted when high modeling accuracy is required. Second, a naive combination of HFL and VFL has limitations, including low training efficiency, unsound convergence analysis, and a lack of parameter tuning strategies. In this paper, we provide a thorough study of an effective integration of HFL and VFL that achieves communication efficiency and overcomes the above limitations when data is both horizontally and vertically partitioned. Specifically, we propose a hybrid federated learning framework with one intermediate result exchange and two aggregation phases. Based on this framework, we develop a Hybrid Stochastic Gradient Descent (HSGD) algorithm to train models. Then, we theoretically analyze the convergence upper bound of the proposed algorithm. Using the convergence results, we design adaptive strategies to adjust the training parameters and shrink the size of the transmitted data. Experimental results validate that the proposed HSGD algorithm can achieve the desired accuracy while reducing communication cost, and they also verify the effectiveness of the adaptive strategies.
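To make the phrase "both horizontally and vertically partitioned" concrete, the following minimal sketch splits a toy patient table first by samples (horizontal) and then by features (vertical). It illustrates the data layout only and is not the HSGD algorithm. The records, the column choices, and the function names `horizontal_split`, `vertical_split`, and `hybrid_split` are hypothetical and do not come from the paper; assigning one feature slice to a wearable and the other to a hospital is likewise an assumption made for illustration.

```python
# Toy illustration only: hypothetical data and names, not taken from the paper.
# Each row is one patient (a sample); each column is one feature.
# Columns: patient_id, heart_rate, steps, blood_marker, label
RECORDS = [
    ("p1", 72, 8000, 5.1, 0),
    ("p2", 88, 3000, 6.4, 1),
    ("p3", 65, 12000, 4.9, 0),
    ("p4", 91, 2500, 7.0, 1),
]


def horizontal_split(rows, num_groups):
    """Partition by samples: each group keeps all columns for a subset of patients."""
    return [rows[i::num_groups] for i in range(num_groups)]


def vertical_split(rows, cols_a, cols_b):
    """Partition by features: two parties keep different columns for the same patients."""
    part_a = [tuple(row[c] for c in cols_a) for row in rows]
    part_b = [tuple(row[c] for c in cols_b) for row in rows]
    return part_a, part_b


def hybrid_split(rows, num_groups, cols_a, cols_b):
    """Apply the horizontal split first, then the vertical split inside each group."""
    return [
        vertical_split(group, cols_a, cols_b)
        for group in horizontal_split(rows, num_groups)
    ]


# Assumed layout: the wearable holds activity features, the hospital holds
# clinical features and labels. Both keep patient_id so rows can be aligned.
WEARABLE_COLS = (0, 1, 2)
HOSPITAL_COLS = (0, 3, 4)

if __name__ == "__main__":
    groups = hybrid_split(RECORDS, 2, WEARABLE_COLS, HOSPITAL_COLS)
    for index, (wearable, hospital) in enumerate(groups):
        print("group", index, "wearable:", wearable, "hospital:", hospital)
```

In this layout no single party holds a full row, which is why a purely horizontal or purely vertical method would need part of the raw data to be moved before it could train on complete records.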
