Federated Progressive Self-Distillation with Logits Calibration for Personalized IIoT Edge Intelligence
Yingchao Wang, Wenqi Niu
TL;DR
The paper addresses a dual forgetting problem in personalized federated learning on non-IID IIoT edge data: during local updates, clients forget both global generalized knowledge and their own historical personalized knowledge. It introduces FedPSD, a client-side framework that couples logits calibration with progressive self-distillation to preserve global generalization while recalling historical personalized knowledge. The approach uses a calibrated fusion label $H_k^{t-1} = \alpha P_k^{t-1} + (1-\alpha) Y_k$ with $\alpha = t/t_{\text{total}}$, a KL-based distillation loss $\mathcal{L}_{KD} = \mathrm{KL}(H_k^{t,e-1} \| P_k^{t,e})$, and a calibrated cross-entropy loss $\mathcal{L}_{CE}$ derived from $P_{\text{calibrated}}(y|x)$, which accounts for the class priors $P(y)$. Experiments on MNIST, CIFAR-10, and CIFAR-100 under pathological sharding and Dirichlet partitions show that FedPSD consistently improves both client and server accuracy and reduces the number of communication rounds needed to reach target performance, with ablations confirming the contribution of each component. The results indicate that FedPSD is a practical, low-overhead way to integrate global and local knowledge on the client side, with strong potential for deployment in resource-constrained, non-IID IIoT environments.
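The fusion label and distillation loss stated above can be illustrated with a minimal sketch for a single sample. The function names (`fusion_label`, `kl_divergence`), the example probabilities, and the `eps` smoothing term are assumptions made for illustration only; this is not the authors' implementation.

```python
import math


def fusion_label(prev_probs, one_hot, t, t_total):
    """H = alpha * P_prev + (1 - alpha) * Y, with alpha = t / t_total.

    prev_probs: historical personalized model output (a probability vector).
    one_hot:    ground-truth label as a one-hot vector.
    """
    alpha = t / t_total
    return [alpha * p + (1.0 - alpha) * y for p, y in zip(prev_probs, one_hot)]


def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student) for two discrete distributions.

    eps is an illustrative smoothing term (an assumption) to avoid log(0).
    """
    return sum(
        h * math.log((h + eps) / (q + eps))
        for h, q in zip(teacher, student)
        if h > 0.0
    )


# Hypothetical values: 3 classes, communication round 5 of 20 (alpha = 0.25).
prev_probs = [0.7, 0.2, 0.1]      # stands in for P_k^{t-1}
one_hot = [1.0, 0.0, 0.0]         # stands in for Y_k
H = fusion_label(prev_probs, one_hot, t=5, t_total=20)

student = [0.6, 0.3, 0.1]         # stands in for the current output P_k^{t,e}
loss_kd = kl_divergence(H, student)
print(H, loss_kd)
```

Because $\alpha$ grows with $t$, early rounds lean on the ground-truth label and later rounds lean increasingly on the historical personalized output.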
Abstract
Personalized Federated Learning (PFL) focuses on tailoring models to individual IIoT clients in federated learning by addressing data heterogeneity and diverse user needs. Although existing studies have proposed effective PFL solutions from various perspectives, they overlook the forgetting of both historical personalized knowledge and global generalized knowledge during local training on clients. This study therefore proposes a novel PFL method, Federated Progressive Self-Distillation (FedPSD), based on logits calibration and progressive self-distillation. We analyze how the characteristics of client data distributions affect the forgetting of personalized and global knowledge. To address global knowledge forgetting, we propose a logits calibration approach for the local training loss and design a progressive self-distillation strategy that facilitates the gradual inheritance of global knowledge: the model outputs from the previous epoch serve as virtual teachers to guide the training of subsequent epochs. To address personalized knowledge forgetting, we construct calibrated fusion labels by integrating historical personalized model outputs; these labels are used as teacher outputs to guide the initial epoch of local self-distillation, enabling rapid recall of personalized knowledge. Extensive experiments under various data heterogeneity scenarios demonstrate the effectiveness and superiority of the proposed FedPSD method.
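A rough sketch of the two client-side ingredients described above follows. The abstract does not give the calibration formula, so the prior-adjusted softmax used here, $p(y|x) \propto P(y)\exp(z_y)$, is an assumption borrowed from common logit-adjustment practice and may differ from the paper's definition. The names `calibrated_probs`, `calibrated_cross_entropy`, and `pick_teacher` are likewise hypothetical.

```python
import math


def calibrated_probs(logits, class_priors, eps=1e-12):
    """Prior-adjusted softmax: p(y|x) proportional to P(y) * exp(z_y).

    ASSUMPTION: this specific form is illustrative and is not taken
    from the paper. eps guards against log(0) for absent classes.
    """
    adjusted = [z + math.log(p + eps) for z, p in zip(logits, class_priors)]
    m = max(adjusted)                      # subtract max for numerical stability
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


def calibrated_cross_entropy(logits, label, class_priors):
    """Cross-entropy computed on the prior-adjusted probabilities."""
    return -math.log(calibrated_probs(logits, class_priors)[label])


def pick_teacher(epoch, fused_label, prev_epoch_probs):
    """Teacher for self-distillation: the fusion label in the first local
    epoch, the previous epoch's outputs in every later epoch."""
    return fused_label if epoch == 0 else prev_epoch_probs


# Hypothetical client whose local data is 90% class 0 and 10% class 1.
priors = [0.9, 0.1]
logits = [0.0, 0.0]
print(calibrated_probs(logits, priors))
print(calibrated_cross_entropy(logits, 1, priors))
```

Under this assumed form, a sample from a locally rare class incurs a larger loss for the same logits, which pushes the model to raise that class's logit rather than defaulting to the locally dominant classes.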
