LiveTune: Dynamic Parameter Tuning for Feedback-Driven Optimization
Soheil Zibakhsh Shabgahi, Nojan Sheybani, Aiden Tabrizi, Farinaz Koushanfar
TL;DR
LiveTune enables hyperparameters to be tuned in real time while a program is running, using LiveVariables and LiveTriggers, and so avoids the inefficiency of restart-based optimization in feedback-driven learning. The framework supports continuous training, integrates with major ML frameworks, and demonstrates energy and time savings, as well as improved RL policy development via reward shaping in the Hungry Thirsty Domain and an in-person competition. Key contributions include the LiveVariables/LiveTriggers design, a central dictionary for coordinated updates, and an open-source API enabling unsupervised feedback-driven optimization. By reducing wasted compute and accelerating experimentation, the work has practical impact for ML and RL workflows, with broader implications for energy-efficient autonomous systems.
Abstract
Feedback-driven optimization, such as traditional machine learning training, is a static process that lacks real-time adaptability of hyperparameters. Tuning solutions for optimization require trial and error paired with checkpointing and schedulers, and in many cases feedback from the algorithm is overlooked. Adjusting hyperparameters during optimization usually requires restarting the program, which wastes utilization and time and places unnecessary strain on memory and processors. We present LiveTune, a novel framework that allows real-time parameter adjustment of optimization loops through LiveVariables. LiveVariables allow for continuous feedback-driven optimization by storing parameters on designated ports on the system, so that they can be adjusted dynamically. Extensive evaluations of our framework on standard machine learning training pipelines show savings of up to 60 seconds and 5.4 kilojoules of energy per hyperparameter change. We also show the feasibility and value of LiveTune in a reinforcement learning application in which users change the dynamics of the reward structure while the agent is learning, yielding a 5x improvement over the baseline. Finally, we outline a fully automated workflow to provide end-to-end, unsupervised feedback-driven optimization.
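
To make the idea of a parameter "stored on a designated port" concrete, the following is a minimal sketch of how such a variable could behave. It is not the LiveTune API: the class `PortBackedVariable`, the helper `send_update`, the plain-text TCP protocol, and the toy objective f(x) = x^2 are all assumptions introduced here for illustration. The sketch shows an optimization loop that reads its learning rate on every step, so a value sent to the port takes effect without restarting the loop.

```python
import socket
import threading


class PortBackedVariable:
    """Hypothetical stand-in for a live variable (not the LiveTune API):
    a float that can be overwritten at runtime via a local TCP port."""

    def __init__(self, initial, port=0):
        self._value = float(initial)
        self._lock = threading.Lock()
        self._server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._server.bind(("127.0.0.1", port))  # port=0 lets the OS pick a free port
        self._server.listen()
        self.port = self._server.getsockname()[1]
        # Daemon thread: it does not keep the process alive after the main thread exits.
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            try:
                conn, _ = self._server.accept()
            except OSError:
                return  # accept failed, e.g. because the server socket was closed
            with conn:
                text = conn.recv(64).decode().strip()
                try:
                    new_value = float(text)
                except ValueError:
                    conn.sendall(b"error\n")  # reject anything that is not a number
                    continue
                with self._lock:
                    self._value = new_value
                conn.sendall(b"ok\n")  # acknowledge only after the value is stored

    def __call__(self):
        """Return the current value; call this inside the loop on every step."""
        with self._lock:
            return self._value

    def close(self):
        self._server.close()


def send_update(port, value):
    """Send a new value to the variable listening on `port`; return its reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=2) as sock:
        sock.sendall(f"{value}\n".encode())
        return sock.recv(16).decode().strip()


# Toy loop: gradient descent on f(x) = x^2, whose gradient is 2x.
lr = PortBackedVariable(0.1)
x = 5.0
for step in range(20):
    if step == 10:
        # In practice this call would come from another terminal or process.
        send_update(lr.port, 0.5)
    x -= lr() * 2 * x  # the learning rate is re-read on every iteration
```

The design choice that matters in this sketch is that the loop calls `lr()` on each step instead of capturing the value once, which is what lets an external update change the behavior of an optimization that is already running.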
