UniSAR: Modeling User Transition Behaviors between Search and Recommendation

Teng Shi, Zihua Si, Jun Xu, Xiao Zhang, Xiaoxue Zang, Kai Zheng, Dewei Leng, Yanan Niu, Yang Song

TL;DR

UniSAR addresses the gap in modeling fine-grained user transitions between search and recommendation by explicitly extracting four transition types with a mask-enabled transformer, aligning transitions via contrastive learning, and fusing them through cross-attention. The approach is trained jointly on S&R data using an MMoE architecture, enabling mutually beneficial improvements in both tasks. Empirical results on KuaiSAR and Amazon Kindle Store demonstrate state-of-the-art performance and validate the effectiveness of each component, including transition extraction, alignment, fusion, and joint training. The work provides a scalable, unified S&R framework with strong interpretability of user transition dynamics and practical impact for platforms offering both services.

Abstract

Nowadays, many platforms provide users with both search and recommendation services as important tools for accessing information. This phenomenon has led to a correlation between user search and recommendation behaviors, providing an opportunity to model user interests in a fine-grained way. Existing approaches either model user search and recommendation behaviors separately or overlook the different transitions between user search and recommendation behaviors. In this paper, we propose a framework named UniSAR that effectively models the different types of fine-grained behavior transitions to provide users with a Unified Search And Recommendation service. Specifically, UniSAR models the user transition behaviors between search and recommendation through three steps: extraction, alignment, and fusion, which are respectively implemented by transformers equipped with pre-defined masks, contrastive learning that aligns the extracted fine-grained user transitions, and cross-attention that fuses the different transitions. To provide users with a unified service, the learned representations are fed into the downstream search and recommendation models. Joint learning on both search and recommendation data is employed so that the two tasks share knowledge and enhance each other. Experimental results on two public datasets demonstrate the effectiveness of UniSAR in terms of enhancing both search and recommendation simultaneously. The experimental analysis further validates that UniSAR enhances the results by successfully modeling the user transition behaviors between search and recommendation.
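The alignment step in the abstract can be illustrated with a generic contrastive objective. The following is a minimal sketch, assuming an InfoNCE-style loss with in-batch negatives; the function name `info_nce`, the temperature value, and the choice of which transition representations form a positive pair are hypothetical, since this summary does not specify the exact loss used by UniSAR.

```python
import numpy as np

def info_nce(a, b, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not necessarily UniSAR's exact loss).

    a, b: arrays of shape (batch, dim). Row i of `a` and row i of `b` are
    treated as a positive pair; every other row of `b` is a negative for a[i].
    """
    # L2-normalize so that the dot product is a cosine similarity.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    # Subtract the row-wise max for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal.
    return -np.mean(np.diag(log_prob))

# Hypothetical usage: two views of the same users' transition representations.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
loss_aligned = info_nce(view_a, view_a)          # positives are identical
loss_shuffled = info_nce(view_a, view_a[::-1])   # positives are mismatched
```

Under these assumptions, minimizing the loss pulls paired representations together and pushes unpaired ones apart, so the aligned case yields a lower loss than the shuffled one.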

Paper Structure (28 sections, 17 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: An example of the user search and recommendation history in an App with both search and recommendation services. The user exhibits four types of transition behaviors, as shown by four arrows.
  • Figure 2: The percentage of cases in which the immediately preceding clicked item correlates with the current item.
  • Figure 3: The architecture of UniSAR. (a) UniSAR model workflow; (b) Implementation of the prediction layer; (c) The user transition modeling module.
  • Figure 4: The attention computation methods differ across various MSA (Multi-head Self-Attention) modules within UniSAR. In $\mathrm{MSA}_r$ or $\mathrm{MSA}_s$, attention is specifically computed within recommendation or search behaviors to extract r2r or s2s, respectively, as detailed in Eq. \ref{eq:MSA_r_s}; $\mathrm{MSA}_m$ computes attention exclusively between different behaviors, outlined in Eq. \ref{eq:MSA_m}, thereby extracting r2s and s2r.
  • Figure 5: The histogram of the cosine similarities between the hidden representations, with and without introducing $\mathcal{L}_{\mathrm{Align}}$.
  • ...and 3 more figures
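
The masking scheme described in the Figure 4 caption can be sketched as follows. This is a minimal single-head illustration, assuming a behavior sequence labelled by type (0 for recommendation, 1 for search); the function names `transition_masks` and `masked_attention` are hypothetical, and any ordering (causal) constraint the paper may apply on top of the type masks is omitted here.

```python
import numpy as np

def transition_masks(types):
    """Build boolean attention masks (True = attention allowed).

    types: 1-D sequence, 0 = recommendation behavior, 1 = search behavior
    (this encoding is an assumption made for the sketch).
    """
    t = np.asarray(types)
    same = t[:, None] == t[None, :]
    mask_r = same & (t == 0)[:, None]  # within recommendation behaviors (r2r)
    mask_s = same & (t == 1)[:, None]  # within search behaviors (s2s)
    mask_m = ~same                     # only across behavior types (r2s, s2r)
    return mask_r, mask_s, mask_m

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Rows with no allowed key would be all -inf; neutralize them first.
    has_key = mask.any(axis=1, keepdims=True)
    scores = np.where(has_key, scores, 0.0)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.where(mask, np.exp(scores), 0.0)
    denom = weights.sum(axis=1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ v  # rows with no allowed key yield a zero vector

# Hypothetical interleaved history: rec, search, rec, search.
mask_r, mask_s, mask_m = transition_masks([0, 1, 0, 1])
x = np.random.default_rng(0).normal(size=(4, 8))
r2r = masked_attention(x, x, x, mask_r)
```

In this sketch the three masks partition all query-key pairs, so each pair of behaviors contributes to exactly one of the within-type or cross-type attention modules.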