FlowRL: Flow-Augmented Few-Shot Reinforcement Learning for Semi-Structured Sensor Data
Mohammad Pivezhandi, Abusayeed Saifullah
TL;DR
FlowRL addresses few-shot reinforcement learning in settings such as Dynamic Voltage and Frequency Scaling (DVFS) by generating high-quality synthetic semi-structured sensor data with distribution-aware flow matching. It combines continuous normalizing flows with latent-space bootstrapping and feature-weighted conditioning to preserve critical correlations while expanding data support, which improves sample efficiency and speeds up Q-value convergence. In a DVFS case study on the NVIDIA Jetson TX2, FlowRL achieves up to 35% higher frame rates and more robust policy learning than model-based and model-free baselines, and the approach generalizes to other semi-structured domains such as robotics and smart grids. The work provides practical mechanisms for flow-based data augmentation in data-scarce RL settings, offering a scalable path for online adaptation in resource-constrained systems.
Abstract
Reinforcement learning (RL) in few-shot scenarios with limited sensor data is challenging due to insufficient training samples, particularly in applications like Dynamic Voltage and Frequency Scaling (DVFS) where sensor readings are semi-structured with inherent correlations. We propose Flow-Augmented Reinforcement Learning (FlowRL), a novel method that leverages continuous normalizing flows to generate high-quality synthetic data for few-shot RL. By integrating latent space bootstrapping for diversity and feature-weighted flow matching to preserve critical data correlations, FlowRL enhances sample efficiency and policy robustness. Evaluated on a DVFS case study using the NVIDIA Jetson TX2, our approach achieves up to 35% higher frame rates and faster Q-value convergence compared to baselines, demonstrating its effectiveness in resource-constrained environments. FlowRL generalizes to other semi-structured domains, such as robotics and smart grids, offering a scalable solution for data-scarce RL settings.
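To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a straight-line interpolation path between noise and data (a common choice in conditional flow matching), a per-feature weight vector applied to the squared velocity error, and plain resampling with replacement as the bootstrap. The function names `feature_weighted_fm_loss` and `bootstrap_latents`, and the arguments `velocity_fn` and `weights`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def feature_weighted_fm_loss(velocity_fn, x0, x1, t, weights):
    """Weighted flow-matching loss on a straight-line path (illustrative assumption).

    x0:      noise samples, shape (batch, d)
    x1:      real sensor samples, shape (batch, d)
    t:       times in [0, 1], shape (batch, 1)
    weights: per-feature importance weights, shape (d,)
    """
    x_t = (1.0 - t) * x0 + t * x1   # point on the interpolation path at time t
    target = x1 - x0                # constant velocity of the straight-line path
    pred = velocity_fn(x_t, t)      # model's predicted velocity at (x_t, t)
    sq_err = (pred - target) ** 2
    # Features with larger weights contribute more to the loss.
    return float(np.mean(np.sum(weights * sq_err, axis=1)))


def bootstrap_latents(z, n_samples, rng):
    """Resample latent vectors with replacement (plain bootstrap, assumed form)."""
    idx = rng.integers(0, len(z), size=n_samples)
    return z[idx]
```

In this sketch, `velocity_fn` stands in for any learned velocity model; giving larger weights to features whose correlations matter most for the policy is one plausible reading of "feature-weighted", and the paper's exact weighting scheme may differ.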
