Harvesting energy from turbulent winds with Reinforcement Learning
Lorenzo Basile, Maria Grazia Berni, Antonio Celani
TL;DR
The paper addresses maximizing the energy harvested from high-altitude turbulent winds with a pumping Airborne Wind Energy (AWE) system. It uses Twin Delayed Deep Deterministic Policy Gradient (TD3) to learn phase-specific, model-free control policies that operate on a minimal state consisting of the kite's attack angle $\alpha$, bank angle $\psi$, and relative wind angle $\beta$. The key contributions are (i) showing that RL can extract energy from turbulent flows with limited sensing, (ii) demonstrating that separate TD3 agents can optimize the traction, retraction, and transition phases, and (iii) revealing that policies trained in turbulence generalize better to turbulent conditions and yield more energy per cycle than policies trained in constant wind. The findings suggest RL-based control as a robust, model-free approach for AWE and motivate the development of digital twins to bridge the sim-to-real gap.
Abstract
Airborne Wind Energy (AWE) is an emerging technology designed to harness the power of high-altitude winds, offering a solution to several limitations of conventional wind turbines. AWE is based on flying devices (usually gliders or kites) that, tethered to a ground station and driven by the wind, convert its mechanical energy into electrical energy by means of a generator. Such systems are usually controlled by manoeuvring the kite so that it follows a predefined path prescribed by optimal control techniques, such as model-predictive control. These methods depend strongly on the specific model in use and are difficult to generalize, especially in unpredictable conditions such as the turbulent atmospheric boundary layer. Our aim is to explore the possibility of replacing these techniques with an approach based on Reinforcement Learning (RL). Unlike traditional methods, RL does not require a predefined model, making it robust to variability and uncertainty. Our experimental results in complex simulated environments demonstrate that AWE agents trained with RL can effectively extract energy from turbulent flows, relying on minimal local information about the kite orientation and speed relative to the wind.
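For readers unfamiliar with TD3, the algorithm named in the TL;DR, the following is a minimal sketch of its critic target: target policy smoothing followed by a clipped double-Q minimum. It is a generic illustration of TD3, not the authors' implementation. The function name `td3_target`, the hyperparameter values, the two-dimensional action, and the toy linear actor and critics are all assumptions made for illustration; only the three-component state $(\alpha, \psi, \beta)$ comes from the text.

```python
import numpy as np

def td3_target(reward, done, next_state, actor_target, q1_target, q2_target,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """Compute the TD3 bootstrap target for a batch of transitions.

    All hyperparameter defaults are illustrative assumptions.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -act_limit, act_limit)
    # Clipped double-Q: bootstrap from the smaller of the two target critics.
    q_min = np.minimum(q1_target(next_state, next_action),
                       q2_target(next_state, next_action))
    # No bootstrapping past terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * q_min

# Toy usage with hypothetical stand-ins for the target networks.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))                      # assumed 2-D action
actor = lambda s: np.tanh(s @ W)                 # bounded in [-1, 1]
q1 = lambda s, a: s.sum(axis=1) + a.sum(axis=1)  # placeholder critic 1
q2 = lambda s, a: s.sum(axis=1) - a.sum(axis=1)  # placeholder critic 2

next_state = rng.uniform(-1.0, 1.0, size=(4, 3))  # columns: alpha, psi, beta
reward = rng.normal(size=4)
done = np.array([0.0, 0.0, 0.0, 1.0])
targets = td3_target(reward, done, next_state, actor, q1, q2, rng)
```

In a phase-specific setup like the one described in the TL;DR, one such target would be computed per agent (traction, retraction, transition), each with its own actor and pair of critics.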
