STTM: A New Approach Based Spatial-Temporal Transformer And Memory Network For Real-time Pressure Signal In On-demand Food Delivery

Jiang Wang, Haibin Wei, Xiaowei Xu, Jiacheng Shi, Jian Nie, Longzhi Du, Taixu Jiang

TL;DR

This work tackles Real-time Pressure Signal (RPS) prediction for on-demand food delivery (OFD) by proposing STTM, a framework that combines a Spatio-Temporal Transformer with a Memory Network to capture district-level spatio-temporal dynamics and to increase sensitivity to anomalous events. The model encodes relative coordinates through a Temporal Position Embedding and a Spatial Position Embedding, integrating time with 2D district geometry, while the Memory Network emphasizes features related to peak periods, weather, and other anomalies. On a real-world Ele.me dataset, STTM outperforms baselines on MAE, MSE, and AMAE by up to 9.66%, 14.13%, and 7.41%, respectively, and ablation studies confirm the contribution of each component. The method has been deployed on a large-scale OFD platform, where it supports proactive system management during high-pressure periods and adverse conditions.

Abstract

On-demand Food Delivery (OFD) services have become very common around the world. For example, on the Ele.me platform, users place more than 15 million food orders every day. Predicting the Real-time Pressure Signal (RPS) is crucial for OFD services, as it is primarily used to measure the current status of pressure on the logistics system. When RPS rises, the pressure increases, and the platform needs to quickly take measures to prevent the logistics system from being overloaded. Usually, the average delivery time for all orders within a business district is used to represent RPS. Existing research on OFD services primarily focuses on predicting the delivery time of orders, while relatively less attention has been given to the study of the RPS. Previous research directly applies general models such as DeepFM, RNN, and GNN for prediction, but fails to adequately utilize the unique temporal and spatial characteristics of OFD services, and faces issues with insufficient sensitivity during sudden severe weather conditions or peak periods. To address these problems, this paper proposes a new method based on Spatio-Temporal Transformer and Memory Network (STTM). Specifically, we use a novel Spatio-Temporal Transformer structure to learn logistics features across temporal and spatial dimensions and encode the historical information of a business district and its neighbors, thereby learning both temporal and spatial information. Additionally, a Memory Network is employed to increase sensitivity to abnormal events. Experimental results on the real-world dataset show that STTM significantly outperforms previous methods in both offline experiments and the online A/B test, demonstrating the effectiveness of this method.
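
The abstract describes RPS as the average delivery time for all orders within a business district. As a rough illustration of that definition only, the sketch below averages delivery times per district for one time window. The record layout `(district_id, delivery_minutes)`, the function name `rps_by_district`, and the sample values are assumptions made for this example and do not come from the paper, which may window or filter orders differently.

```python
from collections import defaultdict


def rps_by_district(orders):
    """Return the mean delivery time per district.

    `orders` is an iterable of (district_id, delivery_minutes) pairs.
    This record layout is a hypothetical one chosen for illustration.
    """
    totals = defaultdict(float)  # sum of delivery times per district
    counts = defaultdict(int)    # number of orders per district
    for district_id, delivery_minutes in orders:
        totals[district_id] += delivery_minutes
        counts[district_id] += 1
    # RPS proxy: average delivery time of all orders in the district
    return {d: totals[d] / counts[d] for d in totals}


# Hypothetical orders observed in a single time window
orders = [("A", 30.0), ("A", 40.0), ("B", 25.0)]
rps = rps_by_district(orders)
print(rps)  # {'A': 35.0, 'B': 25.0}
```

Under this reading, a rising value for a district signals growing pressure on the logistics system there, which is the quantity STTM is trained to predict ahead of time.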

Paper Structure

This paper contains 20 sections, 17 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Visualization of the spatial distribution of RPS in the OFD domain and urban traffic prediction. (a) The spatial structure of the RPS is a circular distribution of business districts, where adjacent districts will influence each other. (b) Urban traffic prediction deals with road network data sampled from sensors.
  • Figure 2: The framework of the proposed STTM method.
  • Figure 3: The impact of hyper-parameters on the performance of the model. (a) Results for hyper-parameters N and M. (b) Results for $L_{mem}$.