IntentRec: Predicting User Session Intent with Hierarchical Multi-Task Learning

Sejoon Oh, Moumita Bhattacharya, Yesu Feng, Sudarshan Lamkhede

TL;DR

IntentRec aims to improve next-item recommendation by explicitly modeling a user’s session intent. A Transformer-based intent encoder aggregates the outputs of multiple intent heads into an intent embedding sequence $\mathcal{Z}_k$, which feeds a separate Transformer-based next-item predictor. Both components are trained jointly under the hierarchical loss $\mathcal{L}_{\text{IntentRec}} = \mathcal{L}_{\text{item}} + \lambda \sum_{i=1}^{M} \mathcal{L}_{\text{intent}_i}$, where the sum runs over the $M$ intent prediction tasks and $\lambda$ scales their contribution, with duration-based weights applied during training. Key contributions include the hierarchical multi-task learning design that couples intent and item predictions, the integration of short-term and long-term user interests, and empirical gains over state-of-the-art next-item and next-intent predictors on Netflix user engagement data, accompanied by qualitative analyses of intent clusters and attention weights. The approach is domain-agnostic: it can be adapted to other platforms by redefining the intent labels and implicit signals, which enables downstream applications such as UI optimization and real-time personalization. Future work focuses on scalable training, online updates, and integration of large language models for richer intent inference.
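The hierarchical loss above can be illustrated with a minimal sketch. It assumes softmax cross-entropy for every task and treats the duration-based weight as a single per-example scalar; neither detail is specified in this summary, and the names `cross_entropy`, `intentrec_loss`, `lam`, and `duration_weight` are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy of one prediction against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[target]


def intentrec_loss(item_logits, item_target, intent_logits, intent_targets,
                   lam=0.5, duration_weight=1.0):
    """L_item + lam * sum_i L_intent_i, scaled by a per-example weight.

    ASSUMPTION: `duration_weight` stands in for the duration-based weighting
    mentioned in the text; how the paper computes and applies it is not
    stated here.
    """
    item_loss = cross_entropy(item_logits, item_target)
    intent_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(intent_logits, intent_targets)
    )
    return duration_weight * (item_loss + lam * intent_loss)


# Toy example: 4 candidate items and M = 2 intent heads with 2 classes each.
loss = intentrec_loss(
    item_logits=[0.0, 0.0, 0.0, 0.0], item_target=2,
    intent_logits=[[0.0, 0.0], [0.0, 0.0]], intent_targets=[0, 1],
    lam=0.5,
)
```

With uniform logits the item term equals $\ln 4$ and each intent term equals $\ln 2$, so the toy loss is $\ln 4 + 0.5 \cdot 2 \ln 2 = \ln 8$. Setting `lam` to zero reduces the objective to plain next-item prediction.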

Abstract

Recommender systems have played a critical role in diverse digital services such as e-commerce, streaming media, and social networks. If we know a user's intent in a given session (e.g., whether they want to watch short videos, watch a movie, or play games, or whether they are shopping for a camping trip), it becomes easier to provide high-quality recommendations. In this paper, we introduce IntentRec, a novel recommendation framework based on a hierarchical multi-task neural network architecture. It estimates a user's latent intent using their short- and long-term implicit signals as proxies, then uses the intent prediction to predict the next item the user is likely to engage with. By directly leveraging the intent prediction, we can offer accurate and personalized recommendations to users. Our comprehensive experiments on Netflix user engagement data show that IntentRec outperforms state-of-the-art next-item and next-intent predictors. We also share several findings and downstream applications of IntentRec.

Paper Structure (18 sections, 4 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Overview of user engagement data in Netflix. User intent can be associated with several interaction metadata. We leverage various implicit signals to predict user intent and next-item.
  • Figure 2: An architectural illustration of our hierarchical multi-task learning model IntentRec for user intent and item predictions.
  • Figure 3: Given a high-level interaction sequence of a user, an input feature sequence is constructed by a concatenation of an interaction feature sequence and a short-term interest feature sequence.
  • Figure 4: Given an input feature sequence of a user, a user intent embedding sequence is constructed by an attention-based aggregation of auxiliary prediction (e.g., Action Type and Genre) results. We use ground-truth intent and item-ID labels to optimize predictions.
  • Figure 5: Given an intent-aware feature sequence of a user, a next-item prediction vector for each sequence position is found by a Transformer encoder and a fully-connected layer.
  • ...and 2 more figures
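The attention-based aggregation described in the Figure 4 caption can be sketched as follows. This is an illustration only: it assumes each auxiliary head (e.g., Action Type, Genre) has already been mapped to an embedding of the same dimension and that attention scores come from a dot product with a learned query vector. The function names and the scoring rule are hypothetical and not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_intents(head_embeddings, query):
    """Attention-weighted sum of per-head intent embeddings.

    ASSUMPTION: scores are dot products between each head embedding and a
    learned `query` vector; the paper's actual scoring function may differ.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query))
              for h in head_embeddings]
    weights = softmax(scores)
    dim = len(query)
    intent = [sum(w * h[d] for w, h in zip(weights, head_embeddings))
              for d in range(dim)]
    return intent, weights


# Two hypothetical heads with 2-dimensional embeddings at one sequence position.
heads = [[1.0, 0.0], [0.0, 1.0]]
intent, weights = aggregate_intents(heads, query=[0.0, 0.0])
```

A zero query scores both heads equally, so the result is their plain average. A query aligned with one head shifts nearly all of the weight to that head, and per-head weights of this kind are what a qualitative analysis of attention weights would inspect.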