Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients
Shaoyuan Chen, Linlin You, Rui Liu, Shuo Yu, Ahmed M. Abdelmoniem
TL;DR
KOALA addresses the privacy and resource limitations of fine-tuning large models in IoT by combining federated learning with bidirectional knowledge distillation between a server-side large model and distributed small models. The method supports homogeneous and heterogeneous small-model modes. Reverse distillation refines the large model and forward distillation updates the small models, using a proxy dataset and preserving data privacy throughout. The key contributions include a novel large-small collaborative learning process, a reverse distillation strategy for heterogeneous outputs, and demonstrated resource efficiency with near-baseline performance on standard benchmarks. These findings indicate that KOALA can enable practical, privacy-preserving large-model adaptation in IoT environments, reducing local storage and computation requirements ($97.2$–$97.6\%$ storage reduction and $98.4$–$98.6\%$ FLOP reduction) while maintaining competitive accuracy.
Abstract
Training large models, including fine-tuning them, is limited by the scarcity of high-quality data. Compared with solutions based on centralized data centers, updating large models in the Internet of Things (IoT) is harder because knowledge must be coordinated across distributed clients that hold private and heterogeneous data. To tackle this challenge, we propose KOALA (Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients) to facilitate the training of large models in IoT. Because the resources available to IoT clients are limited, it is infeasible to run large models locally and update them in a privacy-preserving manner. We therefore leverage federated learning and knowledge distillation to update the large model through collaboration with small models. The small models run locally on IoT clients and process private data separately, and knowledge is transferred between the large and small models through iterative learning between the server and the clients. Moreover, to support clients with similar or different computing capacities, KOALA provides two large-small model joint learning modes: homogeneous and heterogeneous. Experimental results demonstrate that, compared with the conventional approach, our method not only achieves similar training performance but also significantly reduces the need for local storage and computing power.
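To make the large-small knowledge transfer concrete, the following is a minimal sketch of a distillation loss applied in both directions on a single proxy sample. It assumes the common temperature-scaled KL-divergence form of knowledge distillation, which may differ from the exact loss KOALA uses. The function names, the logits, and the temperature value are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilized by subtracting the max."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: a generic distillation loss, not necessarily the
    loss defined in the paper.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

# Hypothetical logits produced by each model for one proxy-dataset sample.
large_logits = [2.0, 0.5, -1.0]   # server-side large model
small_logits = [1.5, 0.7, -0.5]   # client-side small model

# Forward distillation: the large model teaches the small model.
forward_loss = kd_loss(student_logits=small_logits, teacher_logits=large_logits)

# Reverse distillation: the small model teaches the large model.
reverse_loss = kd_loss(student_logits=large_logits, teacher_logits=small_logits)

print(forward_loss, reverse_loss)
```

Only the roles of teacher and student are swapped between the two directions. In a full training loop each loss would be minimized with respect to the student's parameters, which this sketch leaves out.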
