Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting
Jieting Wang, Huimei Shi, Feijiang Li, Xiaolei Shang
TL;DR
This work tackles the limitations of MSE-based time-series forecasting by introducing OCE-TS, which reframes regression as an ordinal probability problem to enable uncertainty quantification. It leverages Target-to-Probability Transformation, a deep ordinal classifier built on a DLinear backbone, and Probability-to-Value Reconstruction to deliver both accurate point forecasts and calibrated distributions. Through influence-function analysis and extensive experiments on seven public datasets, OCE-TS demonstrates superior accuracy, robustness to noise, and stable performance across diverse settings, outperforming strong baselines. The approach offers a practical, scalable path toward probabilistic forecasting with inherent ordinal structure preservation.
Abstract
Time series forecasting is an important task that involves analyzing temporal dependencies and underlying patterns (such as trends, cyclicality, and seasonality) in historical data to predict future values or trends. Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling. Despite enabling direct value prediction, this method offers no uncertainty estimation and exhibits poor outlier robustness. To address these limitations, we propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss, preserving prediction order while quantifying uncertainty through probabilistic outputs. Specifically, OCE-TS begins by discretizing observed values into ordered intervals and deriving their probabilities via a parametric distribution as supervision signals. Using a simple linear model, we then predict probability distributions for each timestep. The OCE loss is computed between the cumulative distributions of predicted and ground-truth probabilities, explicitly preserving ordinal relationships among forecasted values. Through theoretical analysis using influence functions, we establish that cross-entropy (CE) loss exhibits superior stability and outlier robustness compared to MSE loss. Empirically, we compared OCE-TS with five baseline models (Autoformer, DLinear, iTransformer, TimeXer, and TimeBridge) on seven public time series datasets. Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models. The code is publicly available at: https://github.com/Shi-hm/OCE-TS.
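The three steps named above (discretizing targets into soft labels, comparing cumulative distributions, and mapping probabilities back to a value) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation; their released code is the reference. Several choices here are assumptions made only for illustration: the parametric distribution is taken to be a Gaussian with a fixed width `sigma`, the bins are equal-width, the OCE loss is taken to be a binary cross-entropy between cumulative probabilities, and reconstruction is taken to be the expectation over bin centers. All function names are hypothetical.

```python
import math
import numpy as np


def target_to_probability(y, edges, sigma=0.5):
    """Turn a scalar target into a soft label over ordered bins.

    Assumption: a Gaussian centred on y with standard deviation sigma;
    each bin receives the Gaussian mass between its two edges, and the
    result is renormalised over the range covered by the edges.
    """
    cdf = np.array(
        [0.5 * (1.0 + math.erf((e - y) / (sigma * math.sqrt(2.0)))) for e in edges]
    )
    p = np.diff(cdf)
    return p / p.sum()


def ordinal_cross_entropy(p_pred, p_true, eps=1e-7):
    """Binary cross-entropy between cumulative distributions (assumed form)."""
    # Drop the final cumulative entry: it equals 1 for every distribution.
    c_pred = np.clip(np.cumsum(p_pred)[:-1], eps, 1.0 - eps)
    c_true = np.cumsum(p_true)[:-1]
    bce = c_true * np.log(c_pred) + (1.0 - c_true) * np.log(1.0 - c_pred)
    return float(-np.mean(bce))


def probability_to_value(p, edges):
    """Reconstruct a point forecast as the expectation over bin centres."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return float(np.dot(p, centers))


# Twelve equal-width bins on [-3, 3]; the range is an illustrative choice.
edges = np.linspace(-3.0, 3.0, 13)

p_true = target_to_probability(0.4, edges)   # supervision signal for y = 0.4
p_near = target_to_probability(0.9, edges)   # stand-in for a close prediction
p_far = target_to_probability(2.4, edges)    # stand-in for a distant prediction

loss_near = ordinal_cross_entropy(p_near, p_true)
loss_far = ordinal_cross_entropy(p_far, p_true)
y_hat = probability_to_value(p_true, edges)
```

Because the loss is computed on cumulative probabilities, a prediction whose mass sits several bins away from the target is penalized more than one whose mass sits in a neighboring bin, which is the ordinal behavior a plain categorical cross-entropy would not capture. In this sketch `p_near` and `p_far` stand in for model outputs; in the paper these would come from the linear forecasting model.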
