MVG-CRPS: A Robust Loss Function for Multivariate Probabilistic Forecasting

Vincent Zhihao Zheng, Lijun Sun

TL;DR

Experiments on real-world datasets across multivariate autoregressive and univariate sequence-to-sequence forecasting tasks show that MVG-CRPS improves robustness, accuracy, and uncertainty quantification in probabilistic forecasting.

Abstract

Multivariate Gaussian (MVG) distributions are central to modeling correlated continuous variables in probabilistic forecasting. Neural forecasting models typically parameterize the mean vector and covariance matrix of the distribution using neural networks, optimizing with the log-score (negative log-likelihood) as the loss function. However, the sensitivity of the log-score to outliers can lead to significant errors in the presence of anomalies. Drawing on the continuous ranked probability score (CRPS) for univariate distributions, we propose MVG-CRPS, a strictly proper scoring rule for MVG distributions. MVG-CRPS admits a closed-form expression in terms of neural network outputs, thereby integrating seamlessly into deep learning frameworks. Experiments on real-world datasets across multivariate autoregressive and univariate sequence-to-sequence (Seq2Seq) forecasting tasks show that MVG-CRPS improves robustness, accuracy, and uncertainty quantification in probabilistic forecasting.
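
To make the outlier argument concrete, the sketch below compares the log-score with the well-known closed-form CRPS of a univariate Gaussian, the building block the abstract refers to. It is not the paper's MVG-CRPS, whose multivariate construction is not reproduced in this summary. The function names `gaussian_crps` and `gaussian_log_score` are illustrative assumptions, not identifiers from the authors' code.

```python
import math


def gaussian_crps(y, mu, sigma):
    """Closed-form CRPS of a univariate Gaussian N(mu, sigma^2) at observation y.

    Illustrative helper (name is an assumption, not from the paper's code).
    """
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF at z
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def gaussian_log_score(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * z * z + math.log(sigma) + 0.5 * math.log(2.0 * math.pi)


if __name__ == "__main__":
    # Move the observation away from the predicted mean and watch both scores grow.
    for y in (0.0, 2.0, 5.0, 10.0):
        print(y, round(gaussian_log_score(y, 0.0, 1.0), 3), round(gaussian_crps(y, 0.0, 1.0), 3))
```

For a fixed forecast, the log-score grows quadratically in the standardized error, while the CRPS grows roughly linearly, so a single anomalous observation contributes far less to a CRPS-based loss.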

Paper Structure

This paper contains 25 sections, 1 theorem, 25 equations, 7 figures, 7 tables.

Key Result

Theorem 1.1

Let $\mathbf{z}\sim \mathcal{N}\left(\boldsymbol{\mu}_p, \boldsymbol{\Sigma}_p\right)$ be a true $N$-variate Gaussian distribution where the covariance admits eigen-decomposition $\boldsymbol{\Sigma}_p=\boldsymbol{U}_p\boldsymbol{S}_p \boldsymbol{U}_p^{\top}$, with $\boldsymbol{S}_p=\operatorname{diag}(\ldots)$ [...] is proper and strictly proper for multivariate Gaussian distributions.

Figures (7)

  • Figure 1: A motivating example illustrating how MVG-CRPS improves predictive performance by limiting outlier influence and reducing training time through its sampling-free design. The energy score is computed using sample sizes of 50, 100, and 150.
  • Figure 2: Illustration of the multivariate autoregressive and univariate Seq2Seq forecasting tasks.
  • Figure 3: Sensitivity of scoring rules to parameter deviations in the predicted mean, standard deviation, and correlation coefficient from the true data distribution ($\mu_{\text{true}} = 1$, $\sigma_{\text{true}} = 1$, $\rho_{\text{true}} = 0.4$). The energy score is computed with a sample size of 500.
  • Figure 4: Comparison of output covariance matrices $\boldsymbol{\Sigma}_{t}$ from GPVar on the $\mathtt{elec\_weekly}$ dataset. The top and bottom rows display covariance matrices from models trained using the log-score and MVG-CRPS, respectively. For visual clarity, covariance values are clipped between 0 and 0.6.
  • Figure 5: Comparison of probabilistic forecasts from GPVar on the $\mathtt{electricity}$ dataset. The first and second rows display forecasts from models trained using the log-score and MVG-CRPS, respectively.
  • ...and 2 more figures

Theorems & Definitions (2)

  • Theorem 1.1
  • proof