Long or Short or Both? An Exploration on Lookback Time Windows of Behavioral Features in Product Search Ranking
Qi Liu, Atul Singh, Jingbo Liu, Cun Mu, Zheng Yan, Jan Pedersen
TL;DR
Problem: determine how the lookback time window used to aggregate (query, product)-level behavioral features affects product search ranking in eCommerce. Method: compare a long ($|T|=730$) and a short ($|T|=30$) window, generating features from the Bayesian-smoothed posterior $br_{q,p} = \frac{\sum_{t \in T} b_{q,p}^{(t)} + \alpha}{\sum_{t \in T} e_{q,p}^{(t)} + \alpha + \beta}$, and evaluate Baseline, Model A, Model B, and Model C within a tree-based ranking framework. The key innovation is to add query-level vertical signals that guide the integration of features from different windows. Results: long windows help stable verticals such as Food/Consumables, while short windows help dynamic ones such as Fashion/ETS; naively combining both windows harms performance, but vertical-guided multi-window integration (Model C) yields statistically significant gains in engagement and GMV in online A/B tests. Significance: demonstrates a scalable approach to more robust ranking that leverages temporal diversity and query context, with practical benefits for eCommerce search, and sets the stage for extending to more time horizons and signals.
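As an illustration of the smoothed feature above, the following Python sketch computes $br_{q,p}$ for a single (query, product) pair over two lookback windows. It assumes that $b$ counts daily behaviors (for example clicks) and $e$ counts daily exposures (for example impressions), and that the window length is measured in days; neither is stated in this summary. The function names and the prior values `alpha=1.0` and `beta=9.0` are hypothetical choices for illustration, not values from the paper.

```python
from typing import Dict, Sequence


def smoothed_rate(behaviors: Sequence[float], exposures: Sequence[float],
                  alpha: float = 1.0, beta: float = 9.0) -> float:
    """Bayesian-smoothed rate: (sum b + alpha) / (sum e + alpha + beta).

    alpha and beta are illustrative prior pseudo-counts (assumed values).
    """
    if len(behaviors) != len(exposures):
        raise ValueError("behaviors and exposures must have the same length")
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    return (sum(behaviors) + alpha) / (sum(exposures) + alpha + beta)


def window_features(daily_b: Sequence[float], daily_e: Sequence[float],
                    windows: Sequence[int] = (30, 730),
                    alpha: float = 1.0, beta: float = 9.0) -> Dict[int, float]:
    """Compute one smoothed feature per lookback window.

    daily_b and daily_e are ordered oldest to newest, one entry per day
    (an assumption of this sketch). Each window keeps the most recent days.
    """
    features = {}
    for w in windows:
        if w <= 0:
            raise ValueError("window length must be positive")
        features[w] = smoothed_rate(daily_b[-w:], daily_e[-w:], alpha, beta)
    return features


# Hypothetical pair that only recently started receiving engagement:
# 700 days with no behaviors, then 30 days with one behavior per day.
daily_b = [0] * 700 + [1] * 30
daily_e = [10] * 730
features = window_features(daily_b, daily_e)
```

In this toy example the short window reflects the recent engagement while the long window dilutes it, which is the kind of disagreement between windows that the vertical signal is meant to help the ranker resolve. With no history at all, the feature falls back to the prior mean `alpha / (alpha + beta)`.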
Abstract
Customer shopping behavioral features are core to product search ranking models in eCommerce. In this paper, we investigate the effect of lookback time windows when aggregating these features at the (query, product) level over history. By studying the pros and cons of using long and short time windows, we propose a novel approach to integrating these historical behavioral features of different time windows. In particular, we address the criticality of using query-level vertical signals in ranking models to effectively aggregate all information from different behavioral features. Anecdotal evidence for the proposed approach is also provided using live product search traffic on Walmart.com.
