Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting

Jieting Wang, Huimei Shi, Feijiang Li, Xiaolei Shang

TL;DR

This work tackles the limitations of MSE-based time-series forecasting by introducing OCE-TS, which reframes regression as an ordinal probability problem to enable uncertainty quantification. It leverages Target-to-Probability Transformation, a deep ordinal classifier built on a DLinear backbone, and Probability-to-Value Reconstruction to deliver both accurate point forecasts and calibrated distributions. Through influence-function analysis and extensive experiments on seven public datasets, OCE-TS demonstrates superior accuracy, robustness to noise, and stable performance across diverse settings, outperforming strong baselines. The approach offers a practical, scalable path toward probabilistic forecasting with inherent ordinal structure preservation.

Abstract

Time series forecasting is an important task that involves analyzing temporal dependencies and underlying patterns (such as trends, cyclicality, and seasonality) in historical data to predict future values or trends. Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling. Despite enabling direct value prediction, this method offers no uncertainty estimation and exhibits poor outlier robustness. To address these limitations, we propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss, preserving prediction order while quantifying uncertainty through probability output. Specifically, OCE-TS begins by discretizing observed values into ordered intervals and deriving their probabilities via a parametric distribution as supervision signals. Using a simple linear model, we then predict probability distributions for each timestep. The OCE loss is computed between the cumulative distributions of predicted and ground-truth probabilities, explicitly preserving ordinal relationships among forecasted values. Through theoretical analysis using influence functions, we establish that cross-entropy (CE) loss exhibits superior stability and outlier robustness compared to MSE loss. Empirically, we compared OCE-TS with five baseline models (Autoformer, DLinear, iTransformer, TimeXer, and TimeBridge) on seven public time series datasets. Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models. The code is publicly available at: https://github.com/Shi-hm/OCE-TS.
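The three steps named in the abstract (discretize the target into ordered intervals, compare cumulative distributions, map probabilities back to a value) can be illustrated with a minimal pure-Python sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the function names, the choice of a Gaussian as the parametric distribution, the binary cross-entropy form applied to the two CDFs, and the bin-center expectation used for reconstruction. The paper's own Definition 1 may differ.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def target_to_probs(y, edges, sigma):
    """Soft label for target y: Gaussian mass in each ordered bin, renormalized.
    Assumes y lies inside [edges[0], edges[-1]] so the total mass is non-zero."""
    cdf = [normal_cdf(e, y, sigma) for e in edges]
    mass = [cdf[k + 1] - cdf[k] for k in range(len(edges) - 1)]
    total = sum(mass)
    return [m / total for m in mass]

def cumulative(p):
    """Running sum of bin probabilities (a discrete CDF)."""
    out, run = [], 0.0
    for v in p:
        run += v
        out.append(min(run, 1.0))
    return out

def oce_loss(p_pred, p_true, eps=1e-7):
    """Assumed OCE form: binary cross-entropy between the predicted and
    target CDFs, averaged over thresholds. The last threshold is skipped
    because both CDFs equal 1 there."""
    f_pred = cumulative(p_pred)[:-1]
    f_true = cumulative(p_true)[:-1]
    loss = 0.0
    for fp, ft in zip(f_pred, f_true):
        fp = min(max(fp, eps), 1.0 - eps)  # keep the logs finite
        loss -= ft * math.log(fp) + (1.0 - ft) * math.log(1.0 - fp)
    return loss / len(f_pred)

def probs_to_value(p, edges):
    """Point forecast as the expectation over bin centers."""
    centers = [(edges[k] + edges[k + 1]) / 2.0 for k in range(len(p))]
    return sum(pk * ck for pk, ck in zip(p, centers))

# Hypothetical example: four equal-width bins on [-2, 2].
edges = [-2.0, -1.0, 0.0, 1.0, 2.0]
p_true = target_to_probs(-1.5, edges, sigma=0.1)   # mass concentrated in bin 0
near = [0.1, 0.7, 0.1, 0.1]                        # peak one bin away
far = [0.1, 0.1, 0.1, 0.7]                         # peak three bins away
print(oce_loss(near, p_true), oce_loss(far, p_true))
print(probs_to_value(near, edges), probs_to_value(far, edges))
```

In this toy setup both predictions assign 0.1 to the true bin, so a plain categorical cross-entropy would score them almost identically, whereas the cumulative form penalizes the prediction whose mass sits further from the target bin. That distance sensitivity is the ordinal behavior the abstract refers to.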

Paper Structure

This paper contains 50 sections, 3 theorems, 83 equations, 10 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Suppose that the covariance matrix $\bm{\Sigma}_X = \mathbb{E}[\bm{x}\bm{x}^\top]$ is positive definite with $\lambda_{\min}(\bm{\Sigma}_X) > 0$ and the expected Softmax matrix $\bm{P}$ is positive definite with $\lambda_{\min}(\bm{P}) > 0$. Assume the input features have finite second moments [...] where $\kappa_2(\bm{\Sigma}_X) = \|\bm{\Sigma}_X\|_2\|\bm{\Sigma}_X^{-1}\|_2$ represents the spectr[...]
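The quantity $\kappa_2(\bm{\Sigma}_X)$ in the statement above can be made concrete with a small sketch. For a symmetric positive definite matrix, $\|\bm{\Sigma}_X\|_2$ is its largest eigenvalue and $\|\bm{\Sigma}_X^{-1}\|_2$ is the reciprocal of its smallest, so $\kappa_2$ reduces to $\lambda_{\max}/\lambda_{\min}$. The helper names and the restriction to 2x2 matrices are assumptions chosen to keep the example dependency-free; they do not come from the paper.

```python
import math

def sym2x2_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return mean - radius, mean + radius

def spectral_condition_number(a, b, c):
    """kappa_2 = lambda_max / lambda_min for a positive definite 2x2 matrix."""
    lam_min, lam_max = sym2x2_eigenvalues(a, b, c)
    if lam_min <= 0.0:
        raise ValueError("matrix is not positive definite")
    return lam_max / lam_min

# Hypothetical covariance matrices of two unit-variance features.
print(spectral_condition_number(1.0, 0.0, 1.0))    # uncorrelated features
print(spectral_condition_number(1.0, 0.99, 1.0))   # nearly collinear features
```

Strongly correlated inputs push $\lambda_{\min}$ toward zero and inflate $\kappa_2$, which is why the positive-definiteness assumption $\lambda_{\min}(\bm{\Sigma}_X) > 0$ is needed for the quantity to be finite.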

Figures (10)

  • Figure 1: The Framework of Ordinal Cross-Entropy Loss-Based Time Series Forecasting (OCE-TS)
  • Figure 2: Model Fitting Results of DLinear and Ours
  • Figure 3: Distribution Comparison on ETT Datasets
  • Figure 4: Lookback Window Sizes and Parameter Sensitivity
  • Figure 5: The Comparison between CE Loss and OCE Loss
  • ...and 5 more figures

Theorems & Definitions (8)

  • Definition 1: Ordinal Cross-Entropy Loss
  • Theorem 1: Influence Function Growth Rate Comparison
  • Definition 2: Standard Cross-Entropy Loss (General Form)
  • Definition 3: Influence Function
  • Lemma 1
  • proof
  • Theorem 2
  • proof