Let Experts Feel Uncertainty: A Multi-Expert Label Distribution Approach to Probabilistic Time Series Forecasting
Zhen Zhou, Zhirui Wang, Qi Hong, Yunyang Shi, Ziyuan Gu, Zhiyuan Liu
TL;DR
This work addresses the challenge of obtaining accurate yet interpretable probabilistic forecasts for time series. It introduces two frameworks, Multi-Expert LDL and Pattern-Aware LDL-MoE, that combine mixture-of-experts architectures with non-parametric distributional learning, enabling rich, regime-specific uncertainty quantification through Maximum Mean Discrepancy (MMD). On aggregated sales data derived from the M5 dataset, both methods outperform baseline approaches: the continuous Multi-Expert LDL variant achieves the best overall performance, while the pattern-aware variant decomposes each series into trend, seasonality, changepoint, and volatility components and so offers enhanced interpretability through component-wise analysis. The proposed approaches balance predictive accuracy with actionable uncertainty attribution, making them well suited to real-world forecasting where risk-aware decisions are critical. The work lays a foundation for scalable, interpretable probabilistic forecasting in complex temporal domains, while acknowledging computational overhead as a bottleneck to be addressed in future work.
Abstract
Time series forecasting in real-world applications requires both high predictive accuracy and interpretable uncertainty quantification. Traditional point prediction methods often fail to capture the inherent uncertainty in time series data, while existing probabilistic approaches struggle to balance computational efficiency with interpretability. We propose a novel Multi-Expert Learning Distributional Labels (LDL) framework that addresses these challenges through mixture-of-experts architectures with distributional learning capabilities. Our approach introduces two complementary methods: (1) Multi-Expert LDL, which employs multiple experts with different learned parameters to capture diverse temporal patterns, and (2) Pattern-Aware LDL-MoE, which explicitly decomposes time series into interpretable components (trend, seasonality, changepoints, volatility) through specialized sub-experts. Both frameworks extend traditional point prediction to distributional learning, enabling rich uncertainty quantification through Maximum Mean Discrepancy (MMD). We evaluate our methods on aggregated sales data derived from the M5 dataset, demonstrating superior performance compared to baseline approaches. The continuous Multi-Expert LDL achieves the best overall performance, while the Pattern-Aware LDL-MoE provides enhanced interpretability through component-wise analysis. Our frameworks successfully balance predictive accuracy with interpretability, making them suitable for real-world forecasting applications where both performance and actionable insights are crucial.
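To make the role of Maximum Mean Discrepancy concrete, the following is a minimal sketch of how MMD can compare a set of forecast samples with observed values. It is not the authors' implementation: the Gaussian (RBF) kernel, the `bandwidth` parameter, the biased estimator, and the function names `rbf_kernel` and `mmd_squared` are all assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) for 1-D sample arrays a and b.

    The kernel family and bandwidth are illustrative assumptions.
    """
    a = np.asarray(a, dtype=float).reshape(-1, 1)  # column vector
    b = np.asarray(b, dtype=float).reshape(1, -1)  # row vector
    return np.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = rng.normal(loc=0.0, scale=1.0, size=500)
    good_forecast = rng.normal(loc=0.0, scale=1.0, size=500)  # same distribution
    poor_forecast = rng.normal(loc=3.0, scale=1.0, size=500)  # shifted mean

    # A forecast distribution closer to the observations yields a smaller value.
    print("well-matched forecast:", mmd_squared(good_forecast, observed))
    print("shifted forecast:     ", mmd_squared(poor_forecast, observed))
```

Under these assumptions, a smaller value indicates that the predicted distribution is closer to the observed one, which is what makes a quantity of this kind usable for comparing distributional outputs rather than point predictions.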
