F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data
Zexing Xu, Linjun Zhang, Sitan Yang, Rasoul Etesami, Hanghang Tong, Huan Zhang, Jiawei Han
TL;DR
This work tackles peak-period demand forecasting under severe data scarcity by combining proxy data from non-peak periods with a graph-augmented meta-learning framework called F-FOMAML. A GNN forecaster extracts task embeddings, which modulate a meta-learned model through FiLM layers, so the model adapts rapidly to new peak tasks while drawing on related tasks for better generalization. Theoretical analysis provides excess-risk bounds that justify the bias-variance trade-off induced by proxy data. Empirically, the method reduces MAE by 26.24% on an internal vending-machine dataset and by 1.04% on the public JD.com dataset relative to strong baselines, including notable gains over GNN-only benchmarks. The method offers a scalable, domain-agnostic blueprint for data-scarce forecasting in retail and beyond, with potential applications such as real-time promotions and cold-start scenarios.
Abstract
Demand prediction is a crucial task for e-commerce and physical retail businesses, especially during high-stakes sales events. However, the limited availability of historical data from these peak periods poses a significant challenge for traditional forecasting methods. In this paper, we propose a novel approach that predicts demand during peak events by leveraging strategically chosen proxy data, which reflects potential sales patterns of similar entities during non-peak periods and is enriched by features learned from a graph neural network (GNN)-based forecasting model. We formulate demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm, which uses proxy data from non-peak periods and GNN-generated relational metadata to learn feature-specific layer parameters and thereby adapts to demand forecasting for peak events. Theoretically, we show that by accounting for domain similarities through task-specific metadata, our model achieves improved generalization, with the excess risk decreasing as the number of training tasks increases. Empirical evaluations on large-scale industrial datasets show that, compared to existing state-of-the-art models, our method reduces the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
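To make the idea of adapting feature-specific layer parameters on proxy data concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single shared ReLU layer that stays fixed, FiLM-style scale and shift vectors (`gamma`, `beta`) as the only task-specific parameters, a mean-squared-error loss, and synthetic data standing in for a proxy task. All function names, shapes, and hyperparameters are hypothetical, and both the GNN that would supply task embeddings and the outer meta-update are omitted.

```python
import numpy as np

def hidden(x, W):
    """Shared hidden layer with ReLU; W is treated as fixed (assumption)."""
    return np.maximum(x @ W, 0.0)

def predict(h, w_out, gamma, beta):
    """FiLM-style modulation: scale and shift each hidden feature, then read out."""
    return (gamma * h + beta) @ w_out

def mse(h, y, w_out, gamma, beta):
    """Mean squared error of the modulated model on one task."""
    return float(np.mean((predict(h, w_out, gamma, beta) - y) ** 2))

def adapt_film(h, y, w_out, gamma, beta, lr=0.05, steps=50):
    """Inner loop: gradient steps on (gamma, beta) only, using one task's data."""
    for _ in range(steps):
        err = predict(h, w_out, gamma, beta) - y              # shape (n,)
        # Analytic MSE gradients with respect to the FiLM parameters.
        grad_gamma = 2.0 * np.mean(err[:, None] * h * w_out, axis=0)
        grad_beta = 2.0 * np.mean(err[:, None] * w_out, axis=0)
        gamma = gamma - lr * grad_gamma
        beta = beta - lr * grad_beta
    return gamma, beta

# Synthetic stand-in for a proxy task (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))          # 16 observations, 3 input features
W = 0.5 * rng.normal(size=(3, 4))     # shared weights, 4 hidden features
w_out = np.full(4, 0.5)               # fixed linear read-out
y = rng.normal(size=16)               # toy demand targets

h = hidden(x, W)
gamma0, beta0 = np.ones(4), np.zeros(4)   # identity modulation as starting point
gamma1, beta1 = adapt_film(h, y, w_out, gamma0, beta0)
print(mse(h, y, w_out, gamma0, beta0), mse(h, y, w_out, gamma1, beta1))
```

Restricting adaptation to `gamma` and `beta` keeps the number of task-specific parameters small, which is one plausible reason such modulation suits tasks with very few peak-period observations. In a full meta-learning setup, the starting values of these parameters would themselves be learned across tasks rather than fixed at the identity.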
