Scalable and Resource-Efficient Second-Order Federated Learning via Over-the-Air Aggregation
Abdulmomen Ghalkha, Chaouki Ben Issaid, Mehdi Bennis
TL;DR
This work tackles the high computation and communication costs of second-order federated learning by introducing Fed-Sophia, which uses a diagonal Gauss-Newton-Bartlett Hessian estimate with an exponential moving average (EMA) and stability clipping, avoiding full Hessian storage. It then extends the method to Analog Over-the-Air Fed-Sophia, which exploits wireless channel superposition to aggregate updates over the air, transmitting only entries filtered by channel state information (CSI) under a power constraint and with channel-aware scheduling. The combination yields scalable, energy-efficient second-order FL that can handle large models. Experiments on MNIST, Sent140, CIFAR-10, and CIFAR-100 show substantial reductions in communication uploads, latency, and energy consumption while maintaining or improving accuracy, along with robustness to non-IID data. The results indicate that OTA aggregation can make second-order FL practical for large-scale deployments, offering gains in both efficiency and scalability with privacy-preserving updates.
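To make the local update described above concrete, the following is a minimal sketch of one Sophia-style step: an EMA of the gradient, an EMA of a diagonal Hessian estimate, and element-wise clipping of the preconditioned step. It follows the general form of the published Sophia optimizer and is not taken from this paper; the function name `sophia_style_step`, the hyperparameter names, and their default values are assumptions chosen for illustration. The diagonal estimate `hess_diag_est` is passed in directly, so the Gauss-Newton-Bartlett estimator itself is not shown.

```python
import numpy as np

def sophia_style_step(theta, m, h, grad, hess_diag_est,
                      lr=0.1, beta1=0.9, beta2=0.99,
                      gamma=1.0, eps=1e-12, rho=1.0):
    """One illustrative Sophia-style update (hypothetical helper).

    theta: parameters, m: gradient EMA, h: diagonal-Hessian EMA.
    Only vectors of the parameter size are stored, never a full Hessian.
    """
    # EMA of the stochastic gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    # EMA of the diagonal Hessian estimate.
    h = beta2 * h + (1.0 - beta2) * hess_diag_est
    # Precondition by the curvature EMA; eps guards against division by zero.
    ratio = m / np.maximum(gamma * h, eps)
    # Element-wise clipping bounds the step where curvature is tiny or noisy.
    step = np.clip(ratio, -rho, rho)
    theta = theta - lr * step
    return theta, m, h

# Usage: the second coordinate has zero estimated curvature, so its
# preconditioned step is clipped to magnitude rho instead of blowing up.
theta, m, h = sophia_style_step(
    theta=np.zeros(2), m=np.zeros(2), h=np.zeros(2),
    grad=np.array([1.0, -1.0]), hess_diag_est=np.array([100.0, 0.0]),
)
```

The clipping is what keeps the update stable in this sketch: without it, the zero-curvature coordinate would take a step on the order of `1 / eps`.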
Abstract
Second-order federated learning (FL) algorithms offer faster convergence than their first-order counterparts by leveraging curvature information. However, they are hindered by high computational and storage costs, particularly for large-scale models. Furthermore, the communication overhead associated with large models and digital transmission exacerbates these challenges, causing communication bottlenecks. In this work, we propose a scalable second-order FL algorithm that uses a sparse Hessian estimate and leverages over-the-air aggregation, making it feasible for larger models. Our simulation results demonstrate savings of more than $67\%$ in communication resources and energy compared to first- and second-order baselines.
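As a rough illustration of over-the-air aggregation, the sketch below simulates clients transmitting simultaneously so that the channel sums their signals. It assumes truncated channel inversion with real-valued per-entry channel gains, which is a common choice in the over-the-air FL literature but is an assumption here, not a description of the paper's scheme. The function name `ota_aggregate`, the `threshold` parameter, and the noise model are likewise hypothetical.

```python
import numpy as np

def ota_aggregate(updates, channels, threshold=0.1, noise_std=0.0, rng=None):
    """Toy over-the-air averaging (hypothetical helper).

    updates, channels: arrays of shape (num_clients, dim).
    Entries whose channel gain magnitude is below `threshold` are not
    transmitted, which avoids the large power channel inversion would need.
    """
    updates = np.asarray(updates, dtype=float)
    channels = np.asarray(channels, dtype=float)
    # Each client decides per entry whether its channel is good enough.
    mask = np.abs(channels) >= threshold
    # Pre-scale by the inverse channel so the gain cancels at the receiver.
    safe = np.where(mask, channels, 1.0)
    tx = np.where(mask, updates / safe, 0.0)
    # Superposition: the receiver observes the sum of all faded signals.
    rx = np.sum(channels * tx, axis=0)
    if noise_std > 0:
        if rng is None:
            rng = np.random.default_rng()
        rx = rx + rng.normal(0.0, noise_std, size=rx.shape)
    # Normalize each entry by the number of clients that transmitted it.
    count = mask.sum(axis=0)
    return np.where(count > 0, rx / np.maximum(count, 1), 0.0)

# Usage: client 0 skips the second entry because its gain (0.01) is too weak.
avg = ota_aggregate(
    updates=[[1.0, 2.0], [3.0, 4.0]],
    channels=[[0.5, 0.01], [2.0, 1.0]],
)
```

In this toy model the server receives one superimposed signal per entry regardless of the number of clients, which is the source of the communication savings that over-the-air aggregation targets.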
