Monolith: Real Time Recommendation System With Collisionless Embedding Table

Zhuoran Liu, Leqi Zou, Xuan Zou, Caihua Wang, Biao Zhang, Da Tang, Bolin Zhu, Yijie Zhu, Peng Wu, Ke Wang, Youlong Cheng

TL;DR

The paper addresses real-time recommendation challenges arising from sparse, high-cardinality features and non-stationary data distributions (concept drift). It introduces Monolith, a system featuring collisionless embeddings built on Cuckoo Hashing, expirable embeddings, and a streaming-based online training architecture with fault-tolerant synchronization between training and serving components. Key contributions include a collisionless embedding table, a two-stage online training workflow, and an incremental, low-overhead parameter synchronization strategy, demonstrated in production settings and large-scale experiments. Results show improved AUC and robustness to drift compared to collision-prone baselines, with significant online performance benefits from frequent parameter updates validated by live A/B tests.

Abstract

Building a scalable and real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-video ranking or online ads. Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for two reasons: on one hand, tweaking systems based on static parameters and dense computations for recommendation with dynamic and sparse features is detrimental to model quality; on the other hand, such frameworks are designed with the batch-training stage and the serving stage completely separated, preventing the model from interacting with customer feedback in real time. These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training. Our design has been driven by observations of our application workloads and production environment, and it reflects a marked departure from other recommendation systems. Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.
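
To make the first contribution concrete, here is a minimal sketch of a collisionless ID-to-embedding map built on two-table cuckoo hashing, with a frequency filter on admission and time-based expiry. It is a toy illustration, not Monolith's implementation: the class name `CuckooEmbeddingTable`, the parameters `min_count`, `ttl`, and `max_kicks`, the integer timestamps, and the grow-by-doubling policy are all assumptions made for this example.

```python
import random


class CuckooEmbeddingTable:
    """Toy ID -> embedding map; every admitted ID owns its own slot."""

    def __init__(self, capacity=8, dim=4, min_count=2, ttl=100,
                 max_kicks=32, seed=0):
        self.capacity = capacity      # slots per table (hypothetical default)
        self.dim = dim                # embedding dimension
        self.min_count = min_count    # occurrences needed before admission
        self.ttl = ttl                # idle time after which an entry expires
        self.max_kicks = max_kicks    # eviction chain limit before growing
        self.rng = random.Random(seed)
        self.tables = [[None] * capacity for _ in range(2)]
        self.counts = {}              # occurrence counts of not-yet-admitted IDs

    def _slot(self, which, key):
        # Two different hash functions, one per table.
        return hash((which, key)) % self.capacity

    def _find(self, key):
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry
        return None

    def _insert(self, entry):
        which = 0
        for _ in range(self.max_kicks):
            i = self._slot(which, entry[0])
            # Place the entry and pick up whatever occupied the slot.
            entry, self.tables[which][i] = self.tables[which][i], entry
            if entry is None:
                return
            which = 1 - which         # evicted entry goes to its other table
        self._grow(entry)

    def _grow(self, pending):
        entries = [e for t in self.tables for e in t if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity for _ in range(2)]
        for e in entries:
            self._insert(e)

    def lookup(self, key, now):
        """Return the embedding for key, or None if it is still filtered out."""
        entry = self._find(key)
        if entry is not None:
            entry[2] = now            # refresh last-seen time
            return entry[1]
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] < self.min_count:
            return None               # too rare so far: no slot allocated
        del self.counts[key]
        vec = [self.rng.uniform(-0.01, 0.01) for _ in range(self.dim)]
        self._insert([key, vec, now])
        return vec

    def expire(self, now):
        """Drop entries idle for longer than ttl; return how many were removed."""
        removed = 0
        for table in self.tables:
            for i, e in enumerate(table):
                if e is not None and now - e[2] > self.ttl:
                    table[i] = None
                    removed += 1
        return removed
```

Unlike the common trick of taking an ID modulo a fixed table size, two distinct IDs here never share an embedding vector: a colliding insert evicts the occupant to its alternate slot, and the table grows if the eviction chain gets too long. The frequency filter and expiry keep rarely seen or stale IDs from consuming memory, which is the footprint reduction the abstract refers to.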

Paper Structure (19 sections, 11 figures, 1 algorithm)

This paper contains 19 sections, 11 figures, 1 algorithm.

Figures (11)

  • Figure 1: Monolith Online Training Architecture.
  • Figure 2: Worker-PS Architecture.
  • Figure 3: Cuckoo HashMap.
  • Figure 4: Streaming Engine.
  • Figure 5: Online Joiner.
  • ...and 6 more figures