RAG-based User Profiling for Precision Planning in Mixed-precision Over-the-Air Federated Learning
Jinsheng Yuan, Yun Tang, Weisi Guo
TL;DR
This work tackles per-client precision planning in Mixed-Precision Over-the-Air Federated Learning (MP-OTA-FL) by introducing a Retrieval-Augmented Generation (RAG) based user profiling framework. It combines a chat-driven frontend, an LLM-powered backend, and a RAG knowledge base to model contextual factors and historical feedback, enabling per-client precision choices that balance user satisfaction and contribution to the global model. The framework uses a reward-penalty formulation to select the optimal quantization level $q^*$ that maximizes the Satisfaction Score, and demonstrates improvements in user satisfaction (~0.66 vs 0.60) and energy savings (~20%), along with enhanced accuracy for both minority and majority classes in a federated voice assistant task. The approach is validated via a 100-client federated setup with DeepSpeech2 on Common Voice data, and the authors provide an open-source framework to facilitate adoption in human-centered FL settings.
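The reward-penalty selection of $q^*$ mentioned above can be illustrated with a minimal sketch. The summary does not give the paper's actual reward and penalty terms, so the function name `select_precision`, the candidate bit-widths, and the toy numbers below are all hypothetical placeholders. The sketch only shows the general shape of choosing the level that maximizes a score defined as reward minus penalty.

```python
from typing import Callable, Sequence


def select_precision(
    levels: Sequence[int],
    reward: Callable[[int], float],
    penalty: Callable[[int], float],
) -> tuple[int, float]:
    """Return (q_star, score), where q_star maximizes reward(q) - penalty(q)."""
    scored = [(q, reward(q) - penalty(q)) for q in levels]
    # max() keeps the first maximal entry, so ties go to the earlier level.
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidate bit-widths and toy values (not taken from the paper).
LEVELS = (4, 8, 16, 32)
TOY_REWARD = {4: 0.40, 8: 0.70, 16: 0.85, 32: 0.90}   # e.g. expected contribution
TOY_PENALTY = {4: 0.05, 8: 0.10, 16: 0.30, 32: 0.60}  # e.g. energy or latency cost

q_star, score = select_precision(
    LEVELS,
    reward=lambda q: TOY_REWARD[q],
    penalty=lambda q: TOY_PENALTY[q],
)
print(q_star, round(score, 2))
```

With these toy values the 8-bit level wins, because the extra reward at 16 and 32 bits is outweighed by the growing penalty. In the framework described here, the reward and penalty would presumably be informed by the LLM backend and the retrieved client history rather than by fixed tables.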
Abstract
Mixed-precision computing, a widely applied technique in AI, offers a larger trade-off space between accuracy and efficiency. The recently proposed Mixed-Precision Over-the-Air Federated Learning (MP-OTA-FL) enables clients to operate at precision levels appropriate to their heterogeneous hardware, taking advantage of this larger trade-off space while covering the quantization overheads in the mixed-precision modulation scheme for the OTA aggregation process. A key to further exploring the potential of the MP-OTA-FL framework is the optimization of client precision levels. The choice of precision level hinges on multifaceted factors including hardware capability, potential client contribution, and user satisfaction, some of which can be difficult to define or quantify. In this paper, we propose a RAG-based user profiling framework for precision planning that integrates retrieval-augmented LLMs and dynamic client profiling to optimize satisfaction and contributions. This includes a hybrid interface for gathering device and user insights and a RAG database storing historical quantization decisions with feedback. Experiments show that our method improves satisfaction, energy savings, and global model accuracy in MP-OTA-FL systems.
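As a rough illustration of a database that stores historical quantization decisions with feedback, the sketch below keeps a few records and retrieves the ones most similar to a new client context. Everything in it is an assumption made for illustration: the `Record` fields, the feedback scale, and the token-overlap (Jaccard) similarity, which stands in for whatever embedding-based retrieval the actual framework uses.

```python
from dataclasses import dataclass


@dataclass
class Record:
    context: str     # free-text description of the device/user situation
    bits: int        # quantization level that was chosen at the time
    feedback: float  # user feedback on that decision (hypothetical 0-1 scale)


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve(query: str, store: list[Record], k: int = 2) -> list[Record]:
    """Return the k stored records whose context is most similar to the query."""
    ranked = sorted(store, key=lambda r: jaccard(query, r.context), reverse=True)
    return ranked[:k]


# Hypothetical history entries (not taken from the paper).
store = [
    Record("phone low battery commuting", bits=4, feedback=0.8),
    Record("laptop plugged in at home", bits=16, feedback=0.7),
    Record("phone charging overnight at home", bits=16, feedback=0.9),
]

hits = retrieve("phone low battery on the train", store, k=1)
for r in hits:
    print(f"past decision: {r.bits}-bit, feedback {r.feedback} ({r.context})")
```

The retrieved records would then be passed to the LLM as context alongside the current device and user information, so that past decisions and their feedback can inform the next precision choice.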
