Reinforcement learning to maximise wind turbine energy generation

Daniel Soler, Oscar Mariño, David Huergo, Martín de Frutos, Esteban Ferrer

TL;DR

This work develops a reinforcement learning approach that maximizes wind turbine energy generation by jointly controlling yaw, pitch, and rotor speed, using a double deep Q-network (DDQN) with Prioritized Experience Replay coupled to a fast turbine model based on blade element momentum theory (BEMT). The state space includes yaw misalignment, pitch, rotor speed, and wind conditions, while the action space provides discrete adjustments to the three control variables; rewards are designed to maximize the power coefficient $C_p$ with penalties for constraint violations. Results show that DDQN1 and a Value Iteration baseline often outperform PID and uncontrolled strategies across diverse wind scenarios, with DDQN1 adapting well to turbulent winds and unseen conditions, including four months of real wind data. The study reports high-level performance metrics such as the Control Capacity Factor (CCF) and yearly energy production, and points to the potential of RL-driven wind farm optimization and to future hybrid and multi-agent control strategies for grid integration.
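The discrete action space and penalized reward described above can be illustrated with a minimal sketch. It assumes each action decreases, holds, or increases each of the three control variables by a fixed step, which gives 27 actions. The step sizes, operating limits, penalty value, and all names (`STEPS`, `LIMITS`, `apply_action`, `reward`) are hypothetical and are not taken from the paper.

```python
from itertools import product

# Control variables in a fixed order (names are illustrative).
NAMES = ("yaw_deg", "pitch_deg", "rotor_rpm")

# Hypothetical step sizes and operating limits; the summary does not state them.
STEPS = {"yaw_deg": 1.0, "pitch_deg": 0.5, "rotor_rpm": 0.2}
LIMITS = {"yaw_deg": (-30.0, 30.0), "pitch_deg": (-5.0, 25.0), "rotor_rpm": (5.0, 12.0)}

# Each action moves every variable by -1, 0 or +1 step: 3^3 = 27 discrete actions.
ACTIONS = list(product((-1, 0, 1), repeat=len(NAMES)))


def apply_action(controls, action):
    """Apply one discrete action; return (new_controls, violated)."""
    new_controls = {}
    violated = False
    for name, direction in zip(NAMES, action):
        candidate = controls[name] + direction * STEPS[name]
        low, high = LIMITS[name]
        if candidate < low or candidate > high:
            violated = True
            candidate = min(max(candidate, low), high)  # clamp to the allowed range
        new_controls[name] = candidate
    return new_controls, violated


def reward(cp, violated, penalty=1.0):
    """Reward equals the power coefficient, minus an assumed fixed penalty on violation."""
    return cp - penalty if violated else cp
```

In this sketch the environment would evaluate $C_p$ with the turbine model after each call to `apply_action` and pass the result to `reward`; the actual reward shaping used by the authors may differ.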

Abstract

We propose a reinforcement learning strategy to control wind turbine energy generation by actively changing the rotor speed, the rotor yaw angle and the blade pitch angle. A double deep Q-learning agent with prioritized experience replay is coupled with a blade element momentum model and trained to control the turbine under changing winds. The agent is trained to decide the best control (speed, yaw, pitch) for simple steady winds and is subsequently challenged with real dynamic turbulent winds, showing good performance. The double deep Q-learning agent is compared with a classic value iteration reinforcement learning control, and both strategies outperform a classic PID control in all environments. Furthermore, the reinforcement learning approach is well suited to changing environments, including turbulent/gusty winds, showing great adaptability. Finally, we compare all control strategies with real winds and compute the annual energy production. In this case, the double deep Q-learning algorithm also outperforms classic methodologies.
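For readers unfamiliar with the two ingredients named in the abstract, the following minimal sketch shows the standard double deep Q-learning target and the standard proportional prioritization rule, written in plain Python on lists rather than neural networks. The hyperparameters (`gamma`, `alpha`, `eps`) and function names are illustrative assumptions, and the importance-sampling correction usually paired with prioritized replay is omitted for brevity; this is not the authors' implementation.

```python
import random


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |td_error_i| + eps."""
    priorities = [(abs(err) + eps) ** alpha for err in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(probabilities, batch_size, rng):
    """Draw replay-buffer indices according to their priorities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=batch_size)


if __name__ == "__main__":
    # Online net prefers action 1; the target net's value for action 1 is used.
    print(double_dqn_target(1.0, [0.1, 0.9, 0.3], [0.5, 0.2, 0.8], gamma=0.5))
    probs = per_probabilities([0.05, 2.0, 0.5])
    print(sample_indices(probs, batch_size=4, rng=random.Random(0)))
```

Decoupling action selection from action evaluation is what distinguishes double Q-learning from plain Q-learning, and prioritization makes transitions with large temporal-difference error more likely to be replayed.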

Paper Structure (15 sections, 4 equations, 13 figures, 3 tables, 1 algorithm)

Figures (13)

  • Figure 1: Flow diagram for the complete algorithm.
  • Figure 2: Neural network architecture.
  • Figure 3: Power coefficient $C_p$ for: (a) a fixed tip speed ratio (TSR) of 8 and (b) a fixed rotational speed.
  • Figure 4: Wind turbine training metrics.
  • Figure 5: Narrow and Wide wind environments used for validation.
  • ...and 8 more figures