Marginalize, Rather than Impute: Probabilistic Wind Power Forecasting with Incomplete Data

Honglin Wen, Pierre Pinson, Jie Gu, Zhijian Jin

TL;DR

Missing data in wind power forecasting can bias models and obscure uncertainty. The authors propose a flow-augmented VAE that learns the joint distribution of features and targets and marginalizes missing features during forecasting, avoiding imputation. Training uses an IWAE objective on observed data, and operational forecasts are generated via importance resampling to produce calibrated predictive scenarios. Empirical results on WIND Toolkit data show CRPS improvements and favorable calibration versus impute-then-predict baselines, with scalable training and efficient real-time forecasting.
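
The importance-resampling step mentioned above can be illustrated on a toy problem. The sketch below is not the authors' flow-augmented VAE: it assumes a hypothetical one-dimensional latent-variable model (`z ~ N(0, 1)`, one observed feature, one target) and uses the prior as the proposal, so that the resampled scenarios can be checked against the closed-form Gaussian answer. The function name `sir_forecast` and the parameters `sigma` and `tau` are illustrative assumptions.

```python
import numpy as np

def sir_forecast(x_obs, n_proposals=50_000, n_scenarios=5_000,
                 sigma=0.5, tau=0.3, seed=0):
    """Sampling-importance-resampling in a toy latent-variable model.

    Toy model (an assumption for illustration, not the paper's model):
        z ~ N(0, 1),  x_obs | z ~ N(z, sigma^2),  y | z ~ N(z, tau^2).
    """
    rng = np.random.default_rng(seed)
    # 1) Draw latent proposals from the prior, used here as the proposal.
    z = rng.standard_normal(n_proposals)
    # 2) Weights are proportional to p(x_obs | z): only observed features
    #    enter, so anything unobserved is marginalized implicitly.
    log_w = -0.5 * ((x_obs - z) / sigma) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the max for stability
    w /= w.sum()
    # 3) Resample latents in proportion to their normalized weights.
    idx = rng.choice(n_proposals, size=n_scenarios, replace=True, p=w)
    # 4) Decode each resampled latent into one target scenario.
    return z[idx] + tau * rng.standard_normal(n_scenarios)

scenarios = sir_forecast(x_obs=1.0)
# Closed-form predictive mean and variance for this Gaussian toy model.
analytic_mean = 1.0 / (1.0 + 0.5 ** 2)
analytic_var = 0.5 ** 2 / (1.0 + 0.5 ** 2) + 0.3 ** 2
print(scenarios.mean(), analytic_mean)
print(scenarios.var(), analytic_var)
```

In a learned model the prior proposal would typically be replaced by an encoder distribution, with the weights corrected accordingly; the resampling logic itself would stay the same.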

Abstract

Machine learning methods are widely and successfully used for probabilistic wind power forecasting, yet the pervasive issue of missing values (e.g., due to sensor faults or communication outages) has received limited attention. The prevailing practice is impute-then-predict, but conditioning on point imputations biases parameter estimates and fails to propagate uncertainty from missing features. Our approach treats missing features and forecast targets uniformly: we learn a joint generative model of features and targets from incomplete data and, at operational deployment, condition on the observed features and marginalize the unobserved ones to produce forecasts. This imputation-free procedure avoids error introduced by imputation and preserves the uncertainty arising from missing features. In experiments, it improves forecast quality in terms of continuous ranked probability score relative to impute-then-predict baselines while incurring substantially lower computational cost than common alternatives.
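
Because forecasts are issued as sampled scenarios, the continuous ranked probability score (CRPS) is commonly estimated directly from those samples. The sketch below uses the standard sample-based identity CRPS = E|X - y| - 0.5 E|X - X'|; it is a generic estimator, not necessarily the exact evaluation code used in the paper, and the function name `crps_from_samples` is an assumption.

```python
import numpy as np

def crps_from_samples(samples, y):
    """Estimate the CRPS of a sample-based forecast for one observation y.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with both expectations
    replaced by averages over the forecast samples.
    """
    x = np.asarray(samples, dtype=float)
    # Average absolute error between scenarios and the observation.
    accuracy = np.mean(np.abs(x - y))
    # Average absolute spread between all pairs of scenarios.
    spread = np.mean(np.abs(x[:, None] - x[None, :]))
    return accuracy - 0.5 * spread

# Two scenarios straddling the observation: 0.5 - 0.5 * 0.5 = 0.25.
print(crps_from_samples([0.0, 1.0], 0.5))
```

Lower values are better. With a single scenario the estimator reduces to the absolute error, which is why the CRPS is often read as a probabilistic generalization of MAE.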

Paper Structure

This paper contains 26 sections, 1 theorem, 37 equations, 6 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Let $\boldsymbol{\gamma}$ be such that for all $t$, $p(\mathbf{m}_t\mid\mathbf{z}_t;\boldsymbol{\gamma})>0$. Assuming data are MAR, $p(\mathbf{z}_t^o,\mathbf{m}_t;\boldsymbol{\theta},\boldsymbol{\gamma})$ is proportional to $p(\mathbf{z}_t^o,\mathbf{m}_t;\boldsymbol{\theta})$ w.r.t. $\boldsymbol{\theta}$ ...
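
For readers unfamiliar with ignorability, the standard argument (Rubin, 1976) can be sketched as follows. The symbol $\mathbf{z}_t^m$ for the missing part of $\mathbf{z}_t$ is an assumption introduced here, and the derivation is the textbook one rather than a reproduction of the paper's proof. Under MAR the missingness mechanism depends only on the observed part, so it can be pulled out of the integral over the missing values:

```latex
\begin{aligned}
p(\mathbf{z}_t^o,\mathbf{m}_t;\boldsymbol{\theta},\boldsymbol{\gamma})
&= \int p(\mathbf{z}_t^o,\mathbf{z}_t^m;\boldsymbol{\theta})\,
        p(\mathbf{m}_t\mid\mathbf{z}_t^o,\mathbf{z}_t^m;\boldsymbol{\gamma})\,
        \mathrm{d}\mathbf{z}_t^m \\
&= p(\mathbf{m}_t\mid\mathbf{z}_t^o;\boldsymbol{\gamma})
   \int p(\mathbf{z}_t^o,\mathbf{z}_t^m;\boldsymbol{\theta})\,
        \mathrm{d}\mathbf{z}_t^m
   \qquad \text{(MAR)} \\
&= p(\mathbf{m}_t\mid\mathbf{z}_t^o;\boldsymbol{\gamma})\,
   p(\mathbf{z}_t^o;\boldsymbol{\theta}).
\end{aligned}
```

The first factor does not involve $\boldsymbol{\theta}$, so maximizing the likelihood over $\boldsymbol{\theta}$ only requires the observed-data term. This is what permits training on observed entries without modeling the missingness mechanism.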

Figures (6)

  • Figure 1: Illustrative samples with missingness
  • Figure 2: Transition from conditional distribution modeling to joint distribution modeling, where gray blocks indicate observed values and white blocks indicate unobserved values.
  • Figure 3: Illustration of how complete samples are predicted by using the encoder and decoder models.
  • Figure 4: Sampling based on the decoder and encoder models for operational forecasting.
  • Figure 5: 4-day episode with 1-step ahead 90% prediction intervals (issued by our proposed approach), along with corresponding observations.
  • ...and 1 more figure

Theorems & Definitions (8)

  • Definition 1: MCAR
  • Definition 2: MAR
  • Definition 3: MNAR
  • Remark 1
  • Remark 2
  • Theorem 1: Ignorability under MAR (Rubin, 1976)
  • proof
  • Remark 3
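
The three missingness mechanisms named in Definitions 1 to 3 can be made concrete with a small simulation. The sketch below is purely illustrative and assumes a hypothetical pair of correlated Gaussian variables: `z0` is always observed, `z1` may go missing, and a mask value of `True` means "missing" (a convention chosen here, not taken from the paper).

```python
import numpy as np

def simulate_masks(n=100_000, rho=0.8, seed=0):
    """Draw correlated (z0, z1) and missingness masks for z1.

    A mask value of True means z1 is missing in that row.
    """
    rng = np.random.default_rng(seed)
    z0 = rng.standard_normal(n)
    z1 = rho * z0 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

    def sigmoid(u):
        return 1.0 / (1.0 + np.exp(-u))

    u = rng.random(n)
    masks = {
        # MCAR: missingness ignores the data entirely.
        "MCAR": u < 0.5,
        # MAR: missingness depends only on the always-observed z0.
        "MAR": u < sigmoid(2.0 * z0),
        # MNAR: missingness depends on the value that goes missing.
        "MNAR": u < sigmoid(2.0 * z1),
    }
    return z0, z1, masks

z0, z1, masks = simulate_masks()
for name, missing in masks.items():
    # Mean of z1 over the rows where it is still observed.
    print(name, round(float(z1[~missing].mean()), 3))
```

Under MCAR the observed mean of `z1` stays close to the true mean of zero, while under MNAR it is shifted because large values are preferentially removed. Under MAR the marginal mean is shifted too, but the shift is fully explained by the observed `z0`, which is the situation in which the missingness mechanism can be ignored.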