Overpredictive Signal Analytics in Federated Learning: Algorithms and Analysis
Vijay Anavangot
TL;DR
This work addresses privacy and communication constraints in federated learning by introducing overpredictive signal analytics that compute envelope approximations $\widehat{f}_{i,\mathrm{env}}(t)$ at edge devices using a convex optimization framework based on a Fourier-basis representation. The server aggregates these envelopes to produce analytics such as the aggregate signal $\widehat{s}(t)$ and the empirical CDF $\widehat{F}_N(x)$ via Glivenko–Cantelli theory, without accessing raw data. The authors derive bounds on the envelope approximation error and the resulting CDF estimation error, revealing tradeoffs between the bandwidth parameter $L$ and accuracy, including order-optimality results for ${\cal L}_1$ and ${\cal L}_2$ envelopes and subsampling effects. Experiments on a publicly available residential energy dataset validate the theoretical bounds and demonstrate practical applicability for energy-demand analytics in smart-city contexts.
Abstract
Edge signal processing facilitates distributed learning and inference in the client-server model of federated learning. In traditional machine learning, clients (IoT devices) that acquire raw signal samples can help a data center (server) learn a global signal model by pooling these distributed samples at a third-party location. Despite the promising capabilities of IoT devices, such distributed deployments often face two challenges: sensitive private data and communication rate constraints. This necessitates a learning approach that communicates a processed approximation of the distributed samples instead of the raw signals. Such a decentralized learning approach based on signal approximations is termed distributed signal analytics in this work. Overpredictive signal approximations may be desired for distributed signal analytics, especially in network demand (capacity) planning applications motivated by federated learning. In this work, we propose algorithms that compute an overpredictive signal approximation at the client devices using an efficient convex optimization framework. Tradeoffs between communication cost, sampling rate, and the signal approximation error are quantified using mathematical analysis. We also show the performance of the proposed distributed algorithms on a publicly available residential energy consumption dataset.
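To make the idea of an overpredictive approximation concrete, the following is a minimal Python sketch. It is not the convex program proposed in this work. As a simpler stand-in, each client computes a least-squares Fourier fit of bandwidth $L$ on a uniform grid and shifts it up by its largest underprediction, which gives a feasible (generally not optimal) envelope that stays at or above the samples. The server then sums the $2L+1$ coefficients per client without seeing the raw samples. The function names, the synthetic signals, and the values of `N` and `L` are assumptions made for illustration only.

```python
import math

def evaluate(coeffs, t):
    """Evaluate a0 + sum_k a_k cos(2*pi*k*t) + b_k sin(2*pi*k*t) at time t in [0, 1)."""
    a0, a, b = coeffs
    return a0 + sum(
        ak * math.cos(2 * math.pi * k * t) + bk * math.sin(2 * math.pi * k * t)
        for k, (ak, bk) in enumerate(zip(a, b), start=1)
    )

def fit_envelope(samples, L):
    """Client side: shifted least-squares Fourier fit (a stand-in, not the paper's method).

    On a uniform grid with N > 2L samples, the projections below are the
    least-squares coefficients. Raising the constant term by the largest
    underprediction makes the result >= every sample (up to rounding).
    """
    N = len(samples)
    if N <= 2 * L:
        raise ValueError("need more than 2L uniform samples")
    t = [n / N for n in range(N)]
    a0 = sum(samples) / N
    a = [2.0 / N * sum(f * math.cos(2 * math.pi * k * tn) for f, tn in zip(samples, t))
         for k in range(1, L + 1)]
    b = [2.0 / N * sum(f * math.sin(2 * math.pi * k * tn) for f, tn in zip(samples, t))
         for k in range(1, L + 1)]
    fit = [evaluate((a0, a, b), tn) for tn in t]
    shift = max(f - g for f, g in zip(samples, fit))  # largest underprediction
    return (a0 + shift, a, b)

def aggregate(envelopes):
    """Server side: sum the clients' coefficients; raw samples are never needed."""
    a0 = sum(e[0] for e in envelopes)
    a = [sum(col) for col in zip(*(e[1] for e in envelopes))]
    b = [sum(col) for col in zip(*(e[2] for e in envelopes))]
    return (a0, a, b)

# Hypothetical setup: 3 clients, N = 64 samples each, bandwidth L = 3.
N, L = 64, 3
client_signals = [
    [1.0 + 0.5 * i + abs(math.sin(2 * math.pi * (i + 1) * n / N)) for n in range(N)]
    for i in range(3)
]
envelopes = [fit_envelope(sig, L) for sig in client_signals]
aggregate_env = aggregate(envelopes)
```

Because each client's envelope dominates its own samples, the summed envelope dominates the summed signal at the sample points, which is the property a capacity planner would rely on. A larger `L` costs more coefficients per client but usually tightens the envelope, mirroring the communication-accuracy tradeoff described above.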
