Multi-Granularity Distribution Modeling for Video Watch Time Prediction via Exponential-Gaussian Mixture Network

Xu Zhao, Ruibo Ma, Jiaqi Chen, Weiqi Zhao, Ping Yang, Yao Hu

TL;DR

This work addresses the challenge of predicting watch time in short-video feeds, where distributions are both highly skewed at coarse granularity and multimodal at finer granularity. It introduces the Exponential-Gaussian Mixture (EGM) distribution and its neural parameterization, the Exponential-Gaussian Mixture Network (EGMN), which uses a shared hidden representation and a Mixture Parameter Generator to estimate an exponential component for skewness and multiple Gaussian components for diversity. The model is trained with a composite objective combining maximum likelihood, entropy regularization, and a regression loss, and it provides end-to-end inference of full conditional distributions with the mean used for prediction. Extensive offline experiments across multiple datasets and online A/B tests on a commercial platform show that EGMN achieves superior distribution fitting and predictive accuracy, demonstrably improving watch time and engagement while offering robust quick-skipping detection and distributional insights. The approach, open-sourced on GitHub, offers a practical, model-agnostic framework for distribution-aware regression in real-world recommender systems and can be extended to other tasks with complex multi-granularity distributions.

Abstract

Accurate watch time prediction is crucial for enhancing user engagement on streaming short-video platforms, but it is complicated by complex distribution characteristics at multiple levels of granularity. Through systematic analysis of real-world industrial data, we uncover two critical challenges in watch time prediction from a distribution perspective: (1) coarse-grained skewness induced by a significant concentration of quick-skips, and (2) fine-grained diversity arising from various user-video interaction patterns. Consequently, we assume that watch time follows the Exponential-Gaussian Mixture (EGM) distribution, where the exponential and Gaussian components characterize the skewness and the diversity, respectively. Accordingly, we propose an Exponential-Gaussian Mixture Network (EGMN) to parameterize the EGM distribution; it consists of two key modules: a hidden representation encoder and a mixture parameter generator. We conducted extensive offline experiments on public datasets and online A/B tests on the industrial short-video feed of the Xiaohongshu app to validate EGMN against existing state-of-the-art methods. The experimental results show that EGMN fits the watch time distribution well at both coarse-grained and fine-grained levels. The related code is open-sourced on GitHub: https://github.com/BestActionNow/EGMN.
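To make the EGM assumption concrete, the following is a minimal sketch of a mixture density with one exponential component and several Gaussian components, together with its mean and per-sample negative log-likelihood. It assumes the common form p(t) = w_0 · λ · exp(−λt) + Σ_k w_k · N(t; μ_k, σ_k²) with weights summing to 1; the function names, argument layout, and parameter values are illustrative assumptions and are not taken from the paper or its repository. In EGMN these parameters would be produced per sample by the mixture parameter generator, whereas here they are fixed by hand.

```python
import math


def egm_pdf(t, weights, lam, mus, sigmas):
    """Density of an exponential-Gaussian mixture at watch time t.

    weights[0] is the weight of the exponential component (rate lam);
    weights[1:] are the weights of the Gaussian components (mus, sigmas).
    This layout is an assumption made for illustration.
    """
    # The exponential component only has support on t >= 0.
    exp_part = weights[0] * lam * math.exp(-lam * t) if t >= 0 else 0.0
    gauss_part = sum(
        w * math.exp(-0.5 * ((t - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights[1:], mus, sigmas)
    )
    return exp_part + gauss_part


def egm_mean(weights, lam, mus):
    """Mixture mean: weighted sum of the component means (1/lam and mu_k)."""
    return weights[0] / lam + sum(w * m for w, m in zip(weights[1:], mus))


def egm_nll(t, weights, lam, mus, sigmas):
    """Negative log-likelihood of one observed watch time under the mixture."""
    return -math.log(egm_pdf(t, weights, lam, mus, sigmas))


# Hypothetical parameters: a heavy quick-skip component plus two viewing modes.
weights = [0.5, 0.3, 0.2]
lam = 0.5
mus = [10.0, 30.0]
sigmas = [2.0, 5.0]

predicted_watch_time = egm_mean(weights, lam, mus)
loss_at_12s = egm_nll(12.0, weights, lam, mus, sigmas)
```

With these hand-picked parameters the mixture mean is 0.5/0.5 + 0.3·10 + 0.2·30 = 10 seconds, which is the kind of point estimate the summary describes as the prediction. The entropy regularization and regression terms of the training objective are not sketched here, since their exact form is not given in this section.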

Paper Structure

This paper contains 32 sections, 9 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Investigation of watch time distribution following the coarse-to-fine paradigm in our real industrial scenario.
  • Figure 2: The proposed EGMN framework.
  • Figure 3: AUC comparison among three binary classification tasks under different quick-skipping thresholds on Indust.
  • Figure 4: Accuracy variation curve of EGMN with different numbers of Gaussian components on KuaiRec.
  • Figure 5: Comparison of KL Divergence and MAE across different methods over training epochs on Indust.
  • ...and 2 more figures