AMLNet: Adversarial Mutual Learning Neural Network for Non-AutoRegressive Multi-Horizon Time Series Forecasting

Yang Lin

TL;DR

AMLNet tackles non-autoregressive multi-horizon forecasting by training deep AR and deep NAR decoders as ensemble teachers to distill knowledge into a lightweight shallow NAR student. It introduces two online KD mechanisms: outcome-driven KD, which dynamically weights teacher guidance based on performance, and hint-driven KD, which uses adversarial training to distill hidden-state distributions, addressing the common NAR issue of unrealistic trajectories. The framework leverages a shared encoder and an adversarially trained hidden-state distillation module to produce continuous, coherent forecasts while maintaining fast inference. Empirical results on four public datasets show AMLNet outperforms strong AR/NAR baselines and existing online KD methods in probabilistic accuracy and latency, highlighting its practical impact for scalable, accurate multi-horizon forecasting.

Abstract

Multi-horizon time series forecasting, crucial across diverse domains, demands high accuracy and speed. While AutoRegressive (AR) models excel in short-term predictions, they suffer speed and error issues as the horizon extends. Non-AutoRegressive (NAR) models suit long-term predictions but struggle with interdependence, yielding unrealistic results. We introduce AMLNet, an innovative NAR model that achieves realistic forecasts through an online Knowledge Distillation (KD) approach. AMLNet harnesses the strengths of both AR and NAR models by training a deep AR decoder and a deep NAR decoder in a collaborative manner, serving as ensemble teachers that impart knowledge to a shallower NAR decoder. This knowledge transfer is facilitated through two key mechanisms: 1) outcome-driven KD, which dynamically weights the contribution of KD losses from the teacher models, enabling the shallow NAR decoder to incorporate the ensemble's diversity; and 2) hint-driven KD, which employs adversarial training to extract valuable insights from the model's hidden states for distillation. Extensive experimentation showcases AMLNet's superiority over conventional AR and NAR models, thereby presenting a promising avenue for multi-horizon time series forecasting that enhances accuracy and expedites computation.
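
The outcome-driven KD idea in the abstract (weighting each teacher's distillation loss dynamically) can be illustrated with a small sketch. The abstract does not give the weighting rule, so everything below is an assumption made for illustration: the softmax over negative teacher errors, the squared-error losses, the `temperature` parameter, and the function names `teacher_weights` and `weighted_kd_loss` are hypothetical and are not taken from the paper.

```python
import numpy as np

def teacher_weights(teacher_losses, temperature=1.0):
    """Assumed rule: softmax over negative losses, so a teacher with a
    lower error on the ground truth receives a larger weight."""
    losses = np.asarray(teacher_losses, dtype=float)
    logits = -losses / temperature
    logits = logits - logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def weighted_kd_loss(student_pred, teacher_preds, targets, temperature=1.0):
    """Combine per-teacher distillation terms using the weights above.

    student_pred, targets: arrays of shape (horizon,)
    teacher_preds: list of arrays, one per teacher (e.g. deep AR, deep NAR)
    Returns (scalar KD loss, weight vector).
    """
    student_pred = np.asarray(student_pred, dtype=float)
    targets = np.asarray(targets, dtype=float)
    teacher_preds = [np.asarray(p, dtype=float) for p in teacher_preds]

    # How well each teacher currently matches the ground truth.
    t_losses = [np.mean((p - targets) ** 2) for p in teacher_preds]
    w = teacher_weights(t_losses, temperature)

    # How far the student is from each teacher's forecast.
    kd_terms = [np.mean((student_pred - p) ** 2) for p in teacher_preds]
    return float(np.dot(w, kd_terms)), w

# Hypothetical usage: two teachers and one student over a 3-step horizon.
targets = [1.0, 2.0, 3.0]
ar_teacher = [1.0, 2.0, 3.0]    # stands in for the deep AR decoder
nar_teacher = [2.0, 3.0, 4.0]   # stands in for the deep NAR decoder
student = [1.2, 2.1, 2.9]       # stands in for the shallow NAR decoder
loss, weights = weighted_kd_loss(student, [ar_teacher, nar_teacher], targets)
```

Because the weights are recomputed from the teachers' current errors, the student leans toward whichever teacher is more accurate at that point in training while still receiving some signal from the other.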

Paper Structure (22 sections, 21 equations, 3 figures, 6 tables, 2 algorithms)

Figures (3)

  • Figure 1: AMLNet comprises an encoder, P1, P2, and S decoders, with each P1 and P2 layer accompanied by a dedicated discriminator. P1 operates as an AR component, while P2 and S function as NAR components. The solid lines depict the feedforward process, while dashed lines represent the data flow for knowledge distillation.
  • Figure 2: Hidden state cosine distance of: (a) DeepAR; (b) Informer and (c) AMLNet. (d) Ground truth vs predictions.
  • Figure 3: $\rho$0.5-loss of DeepAR, Informer and AMLNet with various forecasting horizons on (a) Sanyo set and (b) Hanergy set.
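
The $\rho$0.5-loss named in Figure 3 is not defined in this section. As a rough guide, the sketch below assumes it is the normalized quantile (pinball) loss commonly used to evaluate probabilistic forecasts, evaluated at $\rho = 0.5$. The function name `rho_loss` and the exact normalization are assumptions, not the paper's stated definition.

```python
import numpy as np

def rho_loss(y_true, y_pred, rho=0.5):
    """Normalized quantile loss (assumed definition):
    2 * sum(pinball_rho(y, y_hat)) / sum(|y|)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # Pinball loss: under-prediction is penalized by rho,
    # over-prediction by (1 - rho).
    pinball = np.where(diff >= 0, rho * diff, (rho - 1.0) * diff)
    return float(2.0 * pinball.sum() / np.abs(y_true).sum())

# Hypothetical usage on a 3-step horizon.
score = rho_loss([10.0, 20.0, 30.0], [12.0, 22.0, 30.0], rho=0.5)
```

Under this assumed definition, $\rho = 0.5$ weights over- and under-prediction equally, so the score reduces to the sum of absolute errors divided by the sum of absolute target values.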