Forget to Generalize: Iterative Adaptation for Generalization in Federated Learning

Abdulrahman Alotaibi, Irene Tenison, Miriam Kim, Isaac Lee, Lalana Kagal

TL;DR

This work tackles poor generalization in non-IID federated learning by introducing Iterative Federated Adaptation (IFA), which partitions training into $G$ generations, each with $C$ communication rounds and $E$ local epochs, and at the end of each generation resets a fraction $\rho$ of the parameters to forget client-specific biases. Two reset strategies are proposed (Random Parameter Selection and Later Layer Selection), giving the model a mechanism to re-learn generalizable representations while preserving useful prior knowledge. Across CIFAR-10, MIT Indoors, and Stanford Dogs, IFA yields an average improvement of $21.5\%$, proves robust across IID and non-IID settings and varying client counts, and remains agnostic to the underlying aggregation method. The approach, inspired by continual learning, offers a practical, plug-and-play enhancement for privacy-preserving, web-scale FL systems. Future work includes developing adaptive reset schedules and providing theoretical analyses of generalization-accuracy trade-offs in large-scale federated deployments.

Abstract

The Web is naturally heterogeneous: user devices, geographic regions, browsing patterns, and contexts all lead to highly diverse, unique datasets. Federated Learning (FL) is an important paradigm for the Web because it enables privacy-preserving, collaborative machine learning across diverse user devices, web services, and clients without needing to centralize sensitive data. However, its performance degrades severely under non-IID client distributions, which are prevalent in real-world web systems. In this work, we propose a new training paradigm, Iterative Federated Adaptation (IFA), that enhances generalization in heterogeneous federated settings through a generation-wise forget-and-evolve strategy. Specifically, we divide training into multiple generations and, at the end of each, select a fraction of model parameters (a) randomly or (b) from the later layers of the model and reinitialize them. This iterative forget-and-evolve schedule allows the model to escape local minima and preserve globally relevant representations. Extensive experiments on the CIFAR-10, MIT-Indoors, and Stanford Dogs datasets show that the proposed approach improves global accuracy, especially when the data across clients are non-IID. The method can be implemented on top of any federated algorithm to improve its generalization performance. We observe an average improvement of 21.5% across datasets. This work advances the vision of scalable, privacy-preserving intelligence for real-world heterogeneous and distributed web systems.
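
The generation-wise schedule and the two selection rules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy model (a dict mapping layer names to flat lists of floats), the helper names `init_value`, `random_reset`, `later_layer_reset`, and `run_ifa`, the Gaussian re-initialization, the rule of filling the reset budget from the last layer backwards, and the choice to skip the reset after the final generation are all assumptions made for illustration. The `fed_round` callback stands in for one round of whatever federated algorithm is used underneath.

```python
import random


def init_value(rng):
    # Assumed re-initialization: small Gaussian noise. The abstract does not
    # state which initializer is used, so this is a placeholder.
    return rng.gauss(0.0, 0.01)


def random_reset(params, rho, rng):
    """Reinitialize a fraction rho of all scalar parameters, chosen uniformly."""
    index = [(name, i) for name, w in params.items() for i in range(len(w))]
    k = int(rho * len(index))
    chosen = rng.sample(index, k)  # sampling without replacement
    for name, i in chosen:
        params[name][i] = init_value(rng)
    return chosen


def later_layer_reset(params, rho, rng):
    """Reinitialize about rho of all parameters, taken from the last layers.

    Assumption: layers are stored in forward order, and the budget is filled
    from the final layer backwards.
    """
    total = sum(len(w) for w in params.values())
    budget = int(rho * total)
    chosen = []
    for name in reversed(list(params)):
        if budget <= 0:
            break
        w = params[name]
        n = min(budget, len(w))
        for i in range(len(w) - n, len(w)):
            w[i] = init_value(rng)
            chosen.append((name, i))
        budget -= n
    return chosen


def run_ifa(params, G, C, E, rho, reset_fn, fed_round, rng):
    """Run G generations of C federated rounds, resetting between generations."""
    for g in range(G):
        for _ in range(C):
            fed_round(params, E)  # one round of any FL algorithm, E local epochs
        if g < G - 1:  # assumption: keep the final generation's model intact
            reset_fn(params, rho, rng)
    return params
```

For example, with `params = {"conv1": [1.0] * 8, "conv2": [1.0] * 8, "fc": [1.0] * 4}` and `rho = 0.25`, `later_layer_reset` reinitializes all four `fc` entries plus the last entry of `conv2`, while `random_reset` picks five entries from anywhere in the model. Passing either function as `reset_fn` to `run_ifa` leaves the federated round itself untouched, which is the sense in which the method sits on top of an existing algorithm.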

Paper Structure (17 sections, 2 equations, 1 figure, 3 tables, 1 algorithm)