Explicit Uncertainty Modeling for Video Watch Time Prediction
Shanshan Wu, Shuchang Liu, Shuai Zhang, Xiaoyu Yang, Xiang Li, Lantao Hu, Han Li
TL;DR
This work tackles the stochastic nature of user watch time in video recommendation by proposing EXUM, an explicit uncertainty modeling framework that adds a confidence predictor to existing watch-time backbones and trains it with adversarial confidence maximization. The approach yields a principled mixed prediction p′ that combines the backbone's prediction with ground-truth evidence, mitigating the balancing paradox that uncontrolled uncertainty causes in prior distribution-modeling methods. The method is validated through online A/B testing on a large industrial platform and through extensive offline experiments on the WeChat and KuaiRand datasets, showing consistent improvements in MAE and XAUC across two backbones (D2Q and CREAD). The results demonstrate that explicit uncertainty control is a practical way to improve watch-time prediction and, by extension, recommendation quality, with a flexible design that can adapt to various distribution-modeling backbones.
Abstract
In video recommendation, a critical component that determines the system's recommendation accuracy is the watch-time prediction module, since how long a user watches a video directly reflects personalized preferences. One of the key challenges of this problem is the user's stochastic watch-time behavior. To improve prediction accuracy for such uncertain behavior, existing approaches show that one can either reduce the noise through duration bias modeling or formulate a distribution modeling task to capture the uncertainty. However, uncontrolled uncertainty is not always equally distributed across users and videos, which induces a balancing paradox between model accuracy and the ability to capture out-of-distribution samples. In practice, we find that the uncertainty of the watch-time prediction model also provides key information about user behavior, which, in turn, could benefit the prediction task itself. Following this notion, we derive an explicit uncertainty modeling strategy for the prediction model and propose an adversarial optimization framework that can better exploit the user watch-time behavior. This framework has been deployed online on an industrial video sharing platform that serves hundreds of millions of daily active users, where an online A/B test shows a significant 0.31% increase in users' video watch time. Furthermore, extensive offline experiments on two public datasets verify the effectiveness of the proposed framework across various watch-time prediction backbones.
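To make the idea of a confidence predictor and a mixed prediction p′ concrete, the following is a minimal sketch and not the authors' implementation. It assumes the backbone outputs a probability vector over watch-time buckets and that p′ is a linear interpolation between that vector and the one-hot label, weighted by a per-sample confidence c. The function names, the `lam` weight, and the log-confidence penalty are hypothetical; the penalty is a simple stand-in for the adversarial confidence maximization named above, whose exact form is not given in this section.

```python
import numpy as np

def mix_with_confidence(p, y, c):
    """Interpolate backbone output p toward the label y.

    p: (batch, buckets) backbone probabilities (assumed bucketed output)
    y: (batch, buckets) one-hot ground-truth bucket
    c: (batch,) confidence in [0, 1]; c = 1 keeps p, c = 0 returns y
    """
    c = c[:, None]  # broadcast confidence over the bucket axis
    return c * p + (1.0 - c) * y

def confidence_aware_loss(p, y, c, lam=0.1, eps=1e-12):
    """Cross-entropy on the mixed prediction plus a low-confidence penalty.

    The -log(c) term is an illustrative assumption: without it, the model
    could trivially minimize the loss by always predicting c = 0.
    """
    p_mixed = mix_with_confidence(p, y, c)
    cross_entropy = -np.sum(y * np.log(p_mixed + eps), axis=1)
    penalty = -np.log(c + eps)
    return float(np.mean(cross_entropy + lam * penalty))

# Toy example: one sample, three watch-time buckets, true bucket is index 1.
p = np.array([[0.7, 0.2, 0.1]])
y = np.array([[0.0, 1.0, 0.0]])
loss_confident = confidence_aware_loss(p, y, np.array([1.0]))
loss_hedged = confidence_aware_loss(p, y, np.array([0.5]))
```

In this toy case the backbone is wrong about the bucket, so lowering the confidence reduces the loss: the sample is treated as uncertain rather than forcing the backbone to fit it. How the penalty is balanced against accuracy is exactly the trade-off that the paper's adversarial objective is meant to control.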
