Integrating Physics and Data-Driven Approaches: An Explainable and Uncertainty-Aware Hybrid Model for Wind Turbine Power Prediction
Alfonso Gijón, Simone Eiraudo, Antonio Manjavacas, Daniele Salvatore Schiera, Miguel Molina-Solana, Juan Gómez-Romero
TL;DR
This work advances wind turbine power prediction by combining a physics-based submodel with a data-driven residual submodel in an additive semi-parametric framework. The physics term uses a neural network to model the power coefficient $C_p$, constrained by the Betz limit, while the residual term uses additional variables to capture effects the physics term does not model. The two submodels are trained in two stages. The hybrid model improves prediction accuracy by 37% over the purely physics-based model. SHAP analyses explain how input features influence the residual submodel, and conformalized quantile regression quantifies prediction uncertainty. The resulting framework is a flexible, reliable tool for power forecasting and anomaly detection, and a basis for potential optimization of wind energy systems, with practical implications for operational decision-making and maintenance planning.
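The additive, two-stage structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: a single constant $C_p$ clipped to the Betz limit stands in for the neural network, ordinary least squares stands in for the non-parametric residual submodel, and the data, air density, rotor area, and the `temp` variable are all synthetic or hypothetical.

```python
import numpy as np

BETZ_LIMIT = 16.0 / 27.0  # theoretical upper bound on C_p (about 0.593)


def physics_power(v, cp, rho=1.225, rotor_area=5000.0):
    """Aerodynamic power P = 0.5 * rho * A * C_p * v^3, in watts.

    rho (air density) and rotor_area are assumed, illustrative values.
    """
    return 0.5 * rho * rotor_area * cp * v**3


def rmse(y_true, y_pred):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic data (hypothetical, not the wind farm dataset from the paper).
rng = np.random.default_rng(0)
n = 1000
v = rng.uniform(3.0, 12.0, n)    # wind speed [m/s]
temp = rng.normal(15.0, 5.0, n)  # hypothetical extra variable [deg C]
p_obs = physics_power(v, 0.40) * (1.0 - 0.004 * (temp - 15.0))
p_obs = p_obs + rng.normal(0.0, 2.0e4, n)

# Stage 1: fit the physics term alone. A constant C_p (closed-form least
# squares, clipped to [0, Betz limit]) stands in for the neural network.
unit_power = physics_power(v, 1.0)
cp_raw = unit_power @ p_obs / (unit_power @ unit_power)
cp_hat = float(np.clip(cp_raw, 0.0, BETZ_LIMIT))
p_phys = physics_power(v, cp_hat)

# Stage 2: fit a residual model on what the physics term leaves unexplained,
# using the additional variable. OLS stands in for the non-parametric model.
residual = p_obs - p_phys
features = np.column_stack([np.ones(n), v, temp, v * temp])
coef, *_ = np.linalg.lstsq(features, residual, rcond=None)

# Additive hybrid prediction: physics term plus learned residual.
p_hybrid = p_phys + features @ coef

print(f"C_p estimate:      {cp_hat:.3f}")
print(f"RMSE physics only: {rmse(p_obs, p_phys):.0f} W")
print(f"RMSE hybrid:       {rmse(p_obs, p_hybrid):.0f} W")
```

Because the residual model is fitted after the physics term is frozen, the physics prediction keeps its meaning on its own, and the residual term can be inspected separately.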
Abstract
The rapid growth of the wind energy sector underscores the urgent need to optimize turbine operations and ensure effective maintenance through early fault detection systems. While traditional empirical and physics-based models offer approximate predictions of power generation based on wind speed, they often fail to capture the complex, non-linear relationships between other input variables and the resulting power output. Data-driven machine learning methods present a promising avenue for improving wind turbine modeling by leveraging large datasets, enhancing prediction accuracy but often at the cost of interpretability. In this study, we propose a hybrid semi-parametric model that combines the strengths of both approaches, applied to a dataset from a wind farm with four turbines. The model integrates a physics-inspired submodel, providing a reasonable approximation of power generation, with a non-parametric submodel that predicts the residuals. This non-parametric submodel is trained on a broader range of variables to account for phenomena not captured by the physics-based component. The hybrid model achieves a 37% improvement in prediction accuracy over the physics-based model. To enhance interpretability, SHAP values are used to analyze the influence of input features on the residual submodel's output. Additionally, prediction uncertainties are quantified using a conformalized quantile regression method. The combination of these techniques, alongside the physics grounding of the parametric submodel, provides a flexible, accurate, and reliable framework. Ultimately, this study opens the door for evaluating the impact of unmodeled variables on wind turbine power generation, offering a basis for potential optimization.
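As a rough illustration of the uncertainty step, the sketch below applies the standard split-conformal calibration used in conformalized quantile regression: lower and upper quantile predictions are widened (or narrowed) by a single offset computed on a held-out calibration set. The quantile model here is a deliberately simple stand-in (per-bin empirical quantiles over wind speed), the data are synthetic, and all function names are hypothetical; the paper's actual quantile regressor is not specified in this section.

```python
import numpy as np


def cqr_offset(q_lo_cal, q_hi_cal, y_cal, alpha=0.1):
    """Conformal offset computed from calibration-set conformity scores."""
    # Score is positive when y falls outside [q_lo, q_hi], negative inside.
    scores = np.maximum(q_lo_cal - y_cal, y_cal - q_hi_cal)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))  # finite-sample rank
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])


def fit_binned_quantiles(v_train, y_train, alpha, n_bins=6, v_min=3.0, v_max=12.0):
    """Stand-in quantile regressor: empirical quantiles per wind-speed bin."""
    edges = np.linspace(v_min, v_max, n_bins + 1)

    def to_bin(v):
        return np.clip(np.digitize(v, edges) - 1, 0, n_bins - 1)

    idx = to_bin(v_train)
    lo = np.array([np.quantile(y_train[idx == b], alpha / 2) for b in range(n_bins)])
    hi = np.array([np.quantile(y_train[idx == b], 1 - alpha / 2) for b in range(n_bins)])

    def predict(v):
        b = to_bin(v)
        return lo[b], hi[b]

    return predict


def sample(rng, n):
    """Synthetic heteroscedastic data (hypothetical, arbitrary units)."""
    v = rng.uniform(3.0, 12.0, n)
    y = v**3 + rng.normal(0.0, 10.0 + 5.0 * v, n)
    return v, y


rng = np.random.default_rng(1)
alpha = 0.1  # target miscoverage, i.e. nominal 90% intervals
v_train, y_train = sample(rng, 1000)
v_cal, y_cal = sample(rng, 500)
v_test, y_test = sample(rng, 2000)

predict = fit_binned_quantiles(v_train, y_train, alpha)

# Calibrate: one scalar offset from the held-out calibration set.
lo_cal, hi_cal = predict(v_cal)
offset = cqr_offset(lo_cal, hi_cal, y_cal, alpha)

# Conformalized intervals: shift both quantile predictions by the offset.
lo_test, hi_test = predict(v_test)
lo_test, hi_test = lo_test - offset, hi_test + offset

cal_coverage = float(np.mean((y_cal >= lo_cal - offset) & (y_cal <= hi_cal + offset)))
test_coverage = float(np.mean((y_test >= lo_test) & (y_test <= hi_test)))
print(f"offset: {offset:.2f}")
print(f"calibration coverage: {cal_coverage:.3f}, test coverage: {test_coverage:.3f}")
```

The offset adapts to how well the underlying quantile model is calibrated: it is positive when the raw intervals are too narrow and negative when they are too wide, so the interval width still varies with the inputs.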
