Enhancing Time Series Forecasting via Logic-Inspired Regularization
Jianqi Zhang, Jingyao Wang, Xingchen Shen, Wenwen Qiang
TL;DR
The paper targets a key limitation of Transformer-based TSF: treating all token dependencies equally, which hurts performance when dependencies vary by forecasting scenario. It introduces Attention Logic Regularization (Attn-L-Reg), a plug-in sparsity regularizer grounded in a logic-inspired notion of atomic token representations to enforce minimal, effective dependencies. The authors provide a theoretical generalization bound showing the benefit of L1 regularization on attention and demonstrate strong empirical gains across six real-world datasets, with analyses showing reduced redundancy in attention. This approach yields a practical, model-agnostic improvement for TSF that enhances generalization and interpretability by focusing on the most informative token dependencies.
Abstract
Time series forecasting (TSF) plays a crucial role in many applications, and Transformer-based methods are among the mainstream techniques for it. Existing methods treat all token dependencies equally. However, we find that the effectiveness of token dependencies varies across forecasting scenarios, and ignoring these differences degrades performance. This raises two issues: (1) What are effective token dependencies? (2) How can we learn them? Taking a logical perspective, we align Transformer-based TSF methods with a logical framework and define effective token dependencies as those that ensure tokens act as atomic formulas (Issue 1). We then align the learning process of Transformer methods with the process of obtaining atomic formulas in logic, which inspires a method for learning these effective dependencies (Issue 2). Specifically, we propose Attention Logic Regularization (Attn-L-Reg), a plug-and-play method that guides the model to use fewer but more effective dependencies by making the attention map sparse, thereby ensuring that tokens act as atomic formulas and improving prediction performance. Extensive experiments and theoretical analysis confirm the effectiveness of Attn-L-Reg.
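To make the idea of a sparsity regularizer on the attention map concrete, the following is a minimal sketch and not the authors' implementation. It assumes the training objective is a forecasting loss (here mean squared error) plus a weighted L1-style penalty on the attention entries. The names `mse`, `attention_l1_penalty`, `regularized_loss`, and the weight `lam` are hypothetical. Because every row of a softmax-normalized map already sums to 1, the example uses unnormalized non-negative weights; where exactly the penalty is applied in Attn-L-Reg is not stated in this section.

```python
from typing import List

Matrix = List[List[float]]


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def attention_l1_penalty(attn: Matrix) -> float:
    """Mean absolute value over all attention entries (L1-style sparsity term)."""
    count = sum(len(row) for row in attn)
    return sum(abs(a) for row in attn for a in row) / count


def regularized_loss(pred: List[float], target: List[float],
                     attn: Matrix, lam: float = 0.1) -> float:
    """Forecasting loss plus lam times the attention sparsity penalty.

    Assumption: the two terms are combined additively; lam is a
    hypothetical hyperparameter chosen for illustration.
    """
    return mse(pred, target) + lam * attention_l1_penalty(attn)


# Two toy 2x2 attention maps with unnormalized, non-negative weights.
dense_attn = [[0.9, 0.8], [0.7, 0.9]]   # every token attends to every token
sparse_attn = [[0.9, 0.0], [0.0, 0.9]]  # only a few dependencies are kept

pred, target = [1.0, 2.0], [1.0, 2.0]   # identical, so the MSE term is zero
dense_loss = regularized_loss(pred, target, dense_attn)
sparse_loss = regularized_loss(pred, target, sparse_attn)
```

With equal forecasting error, the sparse map incurs the smaller total loss, so under this assumed objective training is nudged toward keeping fewer dependencies.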
