Online Learning for Recommendations at Grubhub
Alex Egg
TL;DR
This work tackles the challenge of keeping large-scale Grubhub recommender systems fresh and cost-efficient in production by enabling online incremental learning through transfer learning. It combines stateful online updates, pre-training with offline data, and hash-based embedding schemes to handle non-stationary item categories, aiming to balance drift responsiveness with computational efficiency. The authors report a +20% PTR increase and a 45x reduction in cloud costs in A/B tests, highlighting the practical impact of online, stateful learning for large-scale e-commerce recommendations. Overall, the paper demonstrates that offline-to-online transitions, when carefully implemented with pre-training and hashing strategies, can deliver rapid adaptation and substantial cost savings in production recommender systems.
Abstract
We propose a method to easily modify existing offline Recommender Systems to run online using Transfer Learning. Online Learning for Recommender Systems has two main advantages: quality and scale. Like many Machine Learning algorithms in production, a recommender that is not regularly retrained will suffer from Concept Drift. A policy that is updated frequently online can adapt to drift faster than a batch system. This is especially true for user-interaction systems like recommenders, where the underlying distribution can shift drastically to follow user behaviour. As a platform like Grubhub grows rapidly, the cost of running batch training jobs becomes material. A shift from stateless batch learning offline to stateful incremental learning online can recover, at Grubhub for example, up to 45x in cost savings and a +20% metrics increase. There are a few challenges to overcome in the transition to online stateful learning, namely convergence, non-stationary embeddings and off-policy evaluation, which we explore from our experience running this system in production.
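To make the distinction between stateless batch training and stateful incremental training concrete, the following is a minimal Python sketch, not the production Grubhub system. It assumes a toy logistic model with one weight per hashed item bucket; the names `OnlineModel`, `bucket`, `NUM_BUCKETS` and the learning rate are hypothetical and chosen only for illustration. The hashing step stands in for the general idea of mapping a changing set of item IDs onto a fixed-size table, so that items unseen at pre-training time still have a slot to update.

```python
import math
import zlib

NUM_BUCKETS = 1024  # hypothetical fixed table size


def bucket(item_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an arbitrary item id to a fixed bucket (the hashing trick).

    crc32 is used because it is deterministic across processes,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(item_id.encode("utf-8")) % num_buckets


class OnlineModel:
    """Toy logistic model whose state is carried between updates."""

    def __init__(self, num_buckets: int = NUM_BUCKETS, lr: float = 0.1):
        self.w = [0.0] * num_buckets
        self.bias = 0.0
        self.lr = lr

    def predict(self, item_id: str) -> float:
        z = self.bias + self.w[bucket(item_id, len(self.w))]
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, item_id: str, label: int) -> None:
        """One SGD step on log loss for a single (item, label) event."""
        err = self.predict(item_id) - label
        self.w[bucket(item_id, len(self.w))] -= self.lr * err
        self.bias -= self.lr * err

    def state(self) -> dict:
        """Snapshot to persist between incremental training runs."""
        return {"w": list(self.w), "bias": self.bias}

    def load_state(self, state: dict) -> None:
        """Warm-start from a previous snapshot instead of from scratch."""
        self.w = list(state["w"])
        self.bias = state["bias"]


# Offline pre-training on historical events (hypothetical data).
pretrained = OnlineModel()
for item, label in [("pizza_place", 1), ("sushi_bar", 0), ("pizza_place", 1)]:
    pretrained.update(item, label)
checkpoint = pretrained.state()

# Online: restore the checkpoint and apply only the newest events,
# including an item that did not exist during pre-training.
online = OnlineModel()
online.load_state(checkpoint)
for item, label in [("new_taco_truck", 1), ("new_taco_truck", 1)]:
    online.update(item, label)
```

In this sketch each online run touches only the new events and resumes from the saved state, whereas a stateless batch job would replay the full history on every retrain. That difference is the intuition behind the cost savings described above, under the simplifying assumptions stated here.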
