Who to Trust? Aggregating Client Predictions in Federated Distillation

Viktor Kovalchuk; Denis Son; Arman Bolatov; Mohsen Guizani; Samuel Horváth; Maxim Panov; Martin Takáč; Eduard Gorbunov; Nikita Kotelevskii

Abstract

Under data heterogeneity (e.g., class mismatch), clients may produce unreliable predictions for instances belonging to unfamiliar classes. An equally weighted combination of such predictions can corrupt the teacher signal used for distillation. In this paper, we provide a theoretical analysis of Federated Distillation and show that aggregating client predictions on a shared public dataset converges to a neighborhood of the optimum, where the neighborhood size is governed by the aggregation quality. We further propose two uncertainty-aware aggregation methods, UWA and sUWA, which leverage density-based uncertainty estimates to down-weight unreliable client predictions. Experiments on image and text classification benchmarks demonstrate that our methods are particularly effective under high data heterogeneity, while matching standard averaging when heterogeneity is low.
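To make the contrast between equal and uncertainty-aware aggregation concrete, the following is a minimal sketch on a single public sample. It is not the paper's definition of UWA or sUWA, which the abstract does not specify: the weighting rule (a softmax over negated uncertainty scores), the `temperature` parameter, the function names, and all numbers are assumptions chosen purely for illustration.

```python
import math


def equal_average(client_probs):
    """Average per-class probabilities across clients with equal weights."""
    n_clients = len(client_probs)
    n_classes = len(client_probs[0])
    return [sum(p[c] for p in client_probs) / n_clients for c in range(n_classes)]


def uncertainty_weighted_average(client_probs, uncertainties, temperature=1.0):
    """Combine client predictions, down-weighting clients with high uncertainty.

    Assumed rule (illustrative only): weights = softmax(-uncertainty / temperature).
    """
    scores = [-u / temperature for u in uncertainties]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_classes = len(client_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, client_probs))
        for c in range(n_classes)
    ]


# Hypothetical setup: one public sample whose true class is 0, three classes.
# Client 0 has seen class 0 locally; clients 1 and 2 have not (class mismatch),
# so they predict class 1 and report a high uncertainty score.
client_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.60, 0.30],
    [0.10, 0.70, 0.20],
]
uncertainties = [0.1, 3.0, 3.0]  # made-up scores; lower means more reliable

equal = equal_average(client_probs)
weighted = uncertainty_weighted_average(client_probs, uncertainties)

print("equal-weight teacher label:", equal.index(max(equal)))
print("uncertainty-weighted teacher label:", weighted.index(max(weighted)))
```

In this toy setting the two unfamiliar clients outvote the reliable one under equal weighting, while the uncertainty-weighted teacher follows the client that reports low uncertainty. When all clients report similar uncertainty, the assumed weights become nearly uniform and the two aggregates coincide, consistent with the low-heterogeneity behavior described in the abstract.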
