RAG-based User Profiling for Precision Planning in Mixed-precision Over-the-Air Federated Learning

Jinsheng Yuan, Yun Tang, Weisi Guo

TL;DR

This work tackles per-client precision planning in Mixed-Precision Over-the-Air Federated Learning (MP-OTA-FL) by introducing a Retrieval-Augmented Generation (RAG) based user profiling framework. It combines a chat-driven frontend, an LLM-powered backend, and a RAG knowledge base to model contextual factors and historical feedback, enabling per-client precision choices that balance user satisfaction and contribution to the global model. The framework uses a reward-penalty formulation to select the optimal quantization level $q^*$ that maximizes the Satisfaction Score, and demonstrates improvements in user satisfaction (~0.66 vs 0.60) and energy savings (~20%), along with enhanced accuracy for both minority and majority classes in a federated voice assistant task. The approach is validated via a 100-client federated setup with DeepSpeech2 on Common Voice data, and the authors provide an open-source framework to facilitate adoption in human-centered FL settings.
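The selection step described above, choosing the quantization level $q^*$ that maximizes a reward-penalty Satisfaction Score, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the candidate bit widths, the linear reward, the quadratic energy penalty, and the names `satisfaction_score`, `select_precision`, and `energy_weight` are all hypothetical.

```python
from typing import Sequence

# Hypothetical candidate precision levels (bit widths); not from the paper.
LEVELS: Sequence[int] = (4, 8, 16, 32)


def satisfaction_score(q: int, energy_weight: float) -> float:
    """Toy reward-penalty score for precision level q.

    Assumed shapes (illustrative only):
      reward  grows linearly with precision (stand-in for model quality),
      penalty grows quadratically with precision (stand-in for energy cost).
    """
    reward = q / 32
    penalty = energy_weight * (q / 32) ** 2
    return reward - penalty


def select_precision(levels: Sequence[int], energy_weight: float) -> int:
    """Return the level q* with the highest toy satisfaction score."""
    return max(levels, key=lambda q: satisfaction_score(q, energy_weight))


# A user who cares little about energy versus one who cares a lot.
q_quality_focused = select_precision(LEVELS, energy_weight=1.0)
q_energy_focused = select_precision(LEVELS, energy_weight=3.0)
print(q_quality_focused, q_energy_focused)
```

In this toy setup the per-user weight stands in for the profile that the framework derives from chat input and retrieved history; raising it shifts the chosen level towards lower precision.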

Abstract

Mixed-precision computing, a widely applied technique in AI, offers a larger trade-off space between accuracy and efficiency. The recently proposed Mixed-Precision Over-the-Air Federated Learning (MP-OTA-FL) enables clients to operate at precision levels appropriate to their heterogeneous hardware, taking advantage of the larger trade-off space while covering the quantization overheads in the mixed-precision modulation scheme for the OTA aggregation process. A key to further exploring the potential of the MP-OTA-FL framework is the optimization of client precision levels. The choice of precision level hinges on multifaceted factors, including hardware capability, potential client contribution, and user satisfaction, some of which are difficult to define or quantify. In this paper, we propose a RAG-based User Profiling for Precision Planning framework that integrates retrieval-augmented LLMs and dynamic client profiling to optimize satisfaction and contributions. The framework includes a hybrid interface for gathering device and user insights, and a RAG database that stores historical quantization decisions together with user feedback. Experiments show that our method improves satisfaction, energy savings, and global model accuracy in MP-OTA-FL systems.

Paper Structure

This paper contains 17 sections, 4 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: User satisfaction and client contribution potentials in federated learning vary with contextual factors such as usage patterns and operational environment.
  • Figure 2: User-in-the-loop Quantization Planning Framework Overview. The aim is to collect the user's feedback in round $T$ and select the optimal quantization level for round $T+1$ of the federated learning process.
  • Figure 3: Distribution of User Satisfaction Scores and Relative Energy Cost. Compared with planning precision levels under a unified standard, personalized standards achieve a 10% higher average satisfaction score and a 20% energy saving. When the federated system is prioritised towards energy savings, 22% of the satisfaction score can be traded for a total energy saving of 28%.
  • Figure 4: Word accuracy of the global model after 100 communication rounds, by class, under different strategies. Compared to the default strategy, with b) the class-equal strategy, which is biased towards minority classes, our framework trades 2% accuracy on the majority classes for a 5% gain on the minority classes; with c) the majority-centric strategy, our framework raises majority-class accuracy by 4% at the cost of 3% lower accuracy on the minority classes.