Exceedance Probability Forecasting via Regression for Significant Wave Height Prediction
Vitor Cerqueira, Luis Torgo
TL;DR
We address the challenge of predicting extreme significant wave heights by framing it as an exceedance probability forecasting problem. We introduce a novel, non-ensemble approach that converts the point forecasts of a regression model into exceedance probabilities using the Weibull CDF, with the forecast as the location parameter. Experiments on buoy data from the coast of Halifax show that CDF-based probability estimates coupled with a neural network forecasting model outperform traditional classifiers and ensemble-based strategies. The approach yields both actionable exceedance probabilities and informative point forecasts, and it leaves the choice of threshold flexible for decision-making in maritime operations. The results suggest the method may also apply to other domains that need efficient exceedance probability estimates derived from forecasts.
Abstract
Significant wave height forecasting is a key problem in ocean data analytics. It affects several maritime operations, such as managing the passage of vessels or estimating the energy production from waves. In this work, we focus on predicting extreme values of significant wave height, which can cause coastal disasters. We frame this task as an exceedance probability forecasting problem: the goal is to estimate the probability that the significant wave height will exceed a predefined critical threshold. This problem is usually solved using a probabilistic binary classification model or an ensemble of forecasts. Instead, we propose a novel approach based on point forecasting. Computing both types of forecast (binary probabilities and point forecasts) can be useful for decision-makers. While a probabilistic binary forecast streamlines information about exceedance events for end-users, the point forecasts provide additional insight into the upcoming dynamics. The proposed method assumes that each point forecast follows a distribution whose location parameter equals that forecast. We then convert the point forecasts into exceedance probability estimates using the cumulative distribution function of that distribution. We carried out experiments using data from a smart buoy placed on the coast of Halifax, Canada. The results suggest that the proposed method outperforms state-of-the-art approaches for exceedance probability forecasting.
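The conversion step described above can be illustrated with a minimal Python sketch. It assumes a three-parameter Weibull distribution whose location is set to the point forecast. The function names, the `scale` and `shape` values, the forecasts, and the threshold are all hypothetical and chosen only for illustration; the text above does not state how the remaining distribution parameters are set, so this should not be read as the authors' implementation.

```python
import math


def weibull_cdf(x, loc, scale, shape):
    """CDF of a three-parameter Weibull distribution.

    The support starts at `loc`, so the CDF is 0 for x <= loc.
    """
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def exceedance_probability(point_forecast, threshold, scale=1.0, shape=1.5):
    """Estimate P(value > threshold) from a single point forecast.

    The forecast is used as the location parameter. The default
    `scale` and `shape` are illustrative assumptions, not values
    taken from the paper.
    """
    return 1.0 - weibull_cdf(threshold, loc=point_forecast, scale=scale, shape=shape)


# Hypothetical point forecasts of significant wave height (metres)
# and a hypothetical critical threshold.
forecasts = [2.1, 3.4, 5.8]
threshold = 6.0

probs = [exceedance_probability(f, threshold) for f in forecasts]
for f, p in zip(forecasts, probs):
    print(f"forecast={f:.1f} m -> P(height > {threshold:.1f} m) = {p:.4f}")
```

Because the same point forecast can be passed through the CDF with any threshold, the threshold can be changed after the forecasting model has been trained. Note that in this particular parameterization the location is the lower bound of the Weibull support, so any threshold at or below the forecast receives an exceedance probability of 1.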
