Exploiting Fine-Grained Skip Behaviors for Micro-Video Recommendation

Sanghyuck Lee, Sangkeun Park, Jaesung Lee

TL;DR

This study classifies skip interactions that occur within a short time as negative and those that occur after a delay as less positive, and proposes a dual-level graph and a hierarchical ranking loss to learn these fine-grained interactions effectively.

Abstract

The growing trend of sharing short videos on social media platforms, where users capture and share moments from their daily lives, has led to an increase in research efforts focused on micro-video recommendations. However, conventional methods oversimplify the modeling of skip behavior, categorizing interactions solely as positive or negative based on whether skipping occurs. This study was motivated by the importance of the first few seconds of micro-videos, leading to a refinement of signals into three distinct categories: highly positive, less positive, and negative. Specifically, we classify skip interactions occurring within a short time as negatives, while those occurring after a delay are categorized as less positive. The proposed dual-level graph and hierarchical ranking loss are designed to effectively learn these fine-grained interactions. Our experiments demonstrated that the proposed method outperformed three conventional methods across eight evaluation measures on two public datasets.
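The three-way split described in the abstract can be sketched as a small labeling function. This is a minimal illustration, not the authors' code: the 5-second threshold comes from the Figure 2 caption, while the `Interaction` record, its field names, the label strings, and the handling of the exact boundary values are assumptions made for the example.

```python
from dataclasses import dataclass

# Threshold taken from the Figure 2 caption (playing time of 5 s).
SKIP_THRESHOLD_S = 5.0


@dataclass
class Interaction:
    """Hypothetical record layout; field names are assumptions."""
    user_id: int
    video_id: int
    playing_time: float  # seconds the user actually watched
    duration: float      # full length of the video in seconds


def label_interaction(x: Interaction, threshold: float = SKIP_THRESHOLD_S) -> str:
    """Assign one of three preference levels to an interaction."""
    # No skip: the user watched the whole video (assumed to mean
    # playing_time >= duration).
    if x.playing_time >= x.duration:
        return "highly_positive"
    # Delayed skip: the user kept watching beyond the threshold.
    # Treating exactly `threshold` seconds as negative is an assumption.
    if x.playing_time > threshold:
        return "less_positive"
    # Early skip within the first few seconds.
    return "negative"
```

For example, a 30-second video watched for 12 seconds would be labeled `less_positive` under these assumptions, while the same video abandoned after 2 seconds would be labeled `negative`.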

Paper Structure

This paper contains 22 sections, 20 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Distribution of video duration and playing time of the potential skip interactions in two datasets, MVA (doi:10.1145/3539618.3591713) and KuaiRand-Pure (doi:10.1145/3511808.3557624). The figures include only interactions where the playing time is shorter than the video duration; thus, playing time can be taken as the moment at which the skip occurred. Most skips occur within the first five seconds of the video, while the distribution of video durations remains relatively uniform. The conventional playing-time-based approach (doi:10.1145/3539618.3591713) treats any incomplete viewing as negative, ignoring that users may form a positive impression early and skip only after a slight delay. The histogram range is truncated to 0-60 seconds for clarity.
  • Figure 2: Dual-level positive graph construction and negative interest training by ranking loss. All interactions are first divided into Highly Positive Interactions and Skip Behavior Interactions by comparing playing time with video duration. Skip Behavior Interactions are then divided into Less Positive Interactions and Negative Interactions, using a playing time of 5 s as the threshold. Less Positive Interactions indicate a preference to keep watching the video beyond the initial 5 seconds, the period in which skips are most frequent. Highly Positive Interactions and Less Positive Interactions each form their own adjacency graph, and the two adjacency graphs are used in dual-path positive graph learning. Negative Interactions help the model learn preference differences between interactions through a ranking loss.
  • Figure 3: A schematic overview of the proposed dual-path positive graph learning. The video features are processed through distinct paths corresponding to the adjacency matrices from the highly and less positive graphs, reaching the preference prediction layer. The user embedding and video embedding generated from the two paths are then mean-pooled and concatenated. The fused features are passed through the prediction layer to output the preference score of the user $u_i$ for the video $v_j$.
  • Figure 4: Comparison results between total interaction, highly positive only, and proposed dual-level graph.
  • Figure 5: Comparison results between BPR loss with unseen interactions and proposed BPR loss.
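
The Figure 2 caption states that negative interactions train the model through a ranking loss, and Figure 5 mentions a BPR loss as the point of comparison. The exact form of the proposed hierarchical ranking loss is not given in this summary, so the sketch below only illustrates one plausible reading: two chained BPR-style pairwise terms that push scores toward the order highly positive > less positive > negative. The function names and the equal weighting of the two terms are assumptions.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_term(preferred: float, other: float) -> float:
    """Standard BPR pairwise term: small when `preferred` outscores `other`."""
    return -log_sigmoid(preferred - other)


def hierarchical_ranking_loss(s_high: float, s_less: float, s_neg: float) -> float:
    """Assumed two-level ranking loss over one (high, less, negative) triple.

    s_high, s_less, s_neg are predicted preference scores for a highly
    positive, a less positive, and a negative video of the same user.
    """
    # Level 1: highly positive should outrank less positive.
    # Level 2: less positive should outrank negative.
    return bpr_term(s_high, s_less) + bpr_term(s_less, s_neg)
```

Under this formulation, scores that respect the intended order give a smaller loss than scores that reverse it, which is the behavior a hierarchical ranking objective needs; how the paper weights or samples the two levels may differ.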