Scalable Asynchronous Federated Modeling for Spatial Data
Jianwei Shi, Sameh Abdulah, Ying Sun, Marc G. Genton
TL;DR
This work develops a scalable asynchronous federated modeling framework for spatial data using a knots-based low-rank Gaussian process. It introduces block-wise updates with local gradient correction, staleness-aware adaptive aggregation, and moving-average stabilization, and proves linear convergence with explicit dependence on staleness. Empirical results show that the low-rank model better preserves cross-worker spatial structure than an independence model, and that asynchronous updates outperform synchronous ones in heterogeneous settings while staying competitive when resources are balanced. The approach offers robust, privacy-preserving, and scalable spatial inference suitable for distributed environmental, urban, and public-health applications.
Abstract
Spatial data are central to applications such as environmental monitoring and urban planning, but are often distributed across devices where privacy and communication constraints limit direct sharing. Federated modeling offers a practical solution that preserves data privacy while enabling global modeling across distributed data sources. For instance, environmental sensor networks are privacy- and bandwidth-constrained, motivating federated spatial modeling that shares only privacy-preserving summaries to produce timely, high-resolution pollution maps without centralizing raw data. However, existing federated modeling approaches either ignore spatial dependence or rely on synchronous updates that suffer from stragglers in heterogeneous environments. This work proposes an asynchronous federated modeling framework for spatial data based on low-rank Gaussian process approximations. The method employs block-wise optimization and introduces strategies for gradient correction, adaptive aggregation, and stabilized updates. We establish linear convergence with explicit dependence on staleness, a result of independent theoretical interest. Moreover, numerical experiments demonstrate that the asynchronous algorithm matches the performance of its synchronous counterpart under balanced resource allocation and significantly outperforms it in heterogeneous settings, demonstrating greater robustness and scalability.
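The abstract does not spell out the aggregation rule, so the following is only a rough illustration of what staleness-aware aggregation with a moving-average update can look like, not the authors' algorithm. It assumes a polynomial staleness discount and a plain parameter list. The names `staleness_weight`, `Server`, `base`, and `power` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


def staleness_weight(staleness: int, base: float = 0.5, power: float = 1.0) -> float:
    """Assumed polynomial discount: the older the update, the smaller its weight."""
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base / (1.0 + staleness) ** power


@dataclass
class Server:
    theta: List[float]  # current global parameter estimate
    version: int = 0    # number of worker updates applied so far

    def apply(self, worker_theta: List[float], worker_version: int) -> float:
        """Blend a worker's proposal into the global state as a moving average.

        worker_version is the server version the worker started from, so
        (self.version - worker_version) counts the updates it missed.
        """
        staleness = self.version - worker_version
        w = staleness_weight(staleness)
        self.theta = [(1.0 - w) * g + w * p
                      for g, p in zip(self.theta, worker_theta)]
        self.version += 1
        return w


if __name__ == "__main__":
    server = Server(theta=[0.0, 0.0])
    # Two workers both started from version 0; the second arrives one update late.
    print(server.apply([1.0, 2.0], worker_version=0), server.theta)
    print(server.apply([1.0, 2.0], worker_version=0), server.theta)
```

In this sketch the server never waits for slow workers. A late update is still used, but its weight shrinks with the number of updates it missed, which is one simple way to limit the influence of stale information.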
