Harvesting energy from turbulent winds with Reinforcement Learning

Lorenzo Basile, Maria Grazia Berni, Antonio Celani

TL;DR

The paper addresses maximizing the energy harvested from high-altitude turbulent winds with a pumping Airborne Wind Energy (AWE) system. It employs Twin Delayed Deep Deterministic Policy Gradient (TD3) to learn phase-specific, model-free control policies that operate on a minimal state consisting of the kite's attack angle $\alpha$, bank angle $\psi$, and relative wind angle $\beta$. The key contributions are (i) showing that RL can achieve energy extraction in turbulent flows with limited sensing, (ii) demonstrating that separate TD3 agents can optimize the traction, retraction, and transition phases, and (iii) revealing that policies trained in turbulence generalize better to turbulent conditions and outperform constant-wind policies in energy per cycle. The findings suggest RL-based control as a robust, model-free approach for AWE and motivate the development of digital twins to bridge sim-to-real deployment.

Abstract

Airborne Wind Energy (AWE) is an emerging technology designed to harness the power of high-altitude winds, offering a solution to several limitations of conventional wind turbines. AWE is based on flying devices (usually gliders or kites) that, tethered to a ground station and driven by the wind, convert its mechanical energy into electrical energy by means of a generator. Such systems are usually controlled by manoeuvring the kite so as to follow a predefined path prescribed by optimal control techniques, such as model-predictive control. These methods are strongly dependent on the specific model in use and difficult to generalize, especially in unpredictable conditions such as the turbulent atmospheric boundary layer. Our aim is to explore the possibility of replacing these techniques with an approach based on Reinforcement Learning (RL). Unlike traditional methods, RL does not require a predefined model, making it robust to variability and uncertainty. Our experimental results in complex simulated environments demonstrate that AWE agents trained with RL can effectively extract energy from turbulent flows, relying on minimal local information about the kite orientation and speed relative to the wind.

Paper Structure

This paper contains 19 sections, 34 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Schematic representation of a pumping Airborne Wind Energy system (image adapted from Ref. folkersma2019boundary)
  • Figure 2: (A) Schematic force diagram of the AWE system. (B) Side view of the kite, showcasing the state variables $\alpha$ (attack angle) and $\beta$ (relative wind speed angle). Controlling the attack angle allows the kite to soar and glide. (C) Rear view of the kite, depicting the bank angle $\psi$, which is used to turn the kite left and right.
  • Figure 3: Sample kite flight paths in the turbulent Couette flow.
  • Figure 4: (left) Control and state variables (attack angle $\alpha$, bank angle $\psi$ and relative wind speed angle $\beta$) and (right) power profile, both as functions of time during an operative cycle. Both plots refer to the cycle whose trajectory is displayed in the left panel of the figure labeled fig:traj.
  • Figure 5: (A) Overall trajectory of the kite in the turbulent wind pattern when following the policy learned in constant wind; (B) Learned policy; (C) Power profile compared with the one obtained by an agent trained in the turbulent flow.
  • ...and 2 more figures