CAFL-L: Constraint-Aware Federated Learning with Lagrangian Dual Optimization for On-Device Language Models
Dongqi Zheng, Wenjin Fu
TL;DR
CAFL-L addresses the challenge of training language models with federated learning on resource-constrained edge devices by introducing a Lagrangian dual optimization framework that enforces multi-resource budgets (energy, bandwidth, memory, and temperature). The method dynamically adapts training hyperparameters via a dual-variable policy, and a token-budget mechanism based on gradient accumulation keeps training stable when those hyperparameters are scaled down. Empirical results on a Tiny Shakespeare character-level model show substantial gains in constraint satisfaction (roughly 70% energy reduction, 95% communication savings, and 23% memory reduction) with only a modest increase in validation loss, demonstrating practical viability for edge deployment. This work advances federated learning toward real-world, resource-aware on-device language modeling by jointly handling multiple device budgets without severely sacrificing accuracy.
Abstract
We introduce Constraint-Aware Federated Learning with Lagrangian Dual Optimization (CAFL-L), a principled extension of FedAvg that explicitly incorporates device-level resource constraints including energy, communication, memory, and thermal budgets. CAFL-L employs Lagrangian dual optimization to dynamically adapt training hyperparameters (freezing depth, local steps, batch size, and communication compression) while maintaining training stability by preserving the token budget through gradient accumulation. Experiments on a character-level language model demonstrate that CAFL-L achieves superior constraint satisfaction compared to standard FedAvg (reducing memory usage by 20% and communication by 95%) while maintaining competitive validation performance, making it practical for deployment on resource-constrained edge devices.
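The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the normalized violation term (usage divided by budget, minus one), the step size, and the rule for choosing the accumulation factor are all assumptions made for illustration. The first function shows a standard projected dual-ascent update for one round; the second shows one way gradient accumulation could restore the per-round token budget after batch size or local steps are reduced.

```python
import math

def dual_update(lambdas, usage, budgets, step_size=0.5):
    """Projected dual ascent on per-resource multipliers (illustrative form).

    Assumed update: lambda_r <- max(0, lambda_r + step_size * (usage_r / budget_r - 1)).
    A multiplier grows while its resource is over budget and decays toward
    zero (never below it) once usage is within budget.
    """
    return {
        r: max(0.0, lambdas[r] + step_size * (usage[r] / budgets[r] - 1.0))
        for r in lambdas
    }

def accumulation_steps(base_batch, base_steps, batch, steps):
    """Hypothetical token-budget rule.

    Returns the smallest gradient-accumulation factor such that
    batch * steps * factor covers the baseline base_batch * base_steps
    samples per round (sequence length is assumed fixed, so it cancels).
    """
    baseline = base_batch * base_steps
    return max(1, math.ceil(baseline / (batch * steps)))

# Example round: energy is 50% over budget, communication is under budget.
lambdas = {"energy": 0.0, "comm": 0.0}
usage = {"energy": 1.5, "comm": 0.5}
budgets = {"energy": 1.0, "comm": 1.0}
new_lambdas = dual_update(lambdas, usage, budgets)

# Batch size cut from 32 to 8 at 10 local steps: accumulate 4 micro-batches.
accum = accumulation_steps(base_batch=32, base_steps=10, batch=8, steps=10)
```

In this sketch a policy (not shown) would read the multipliers and tighten the corresponding knobs, for example a larger energy multiplier leading to fewer local steps or more frozen layers; how CAFL-L maps multipliers to hyperparameters is not specified in the abstract.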
