
To Recommend or Not: Recommendability Identification in Conversations with Pre-trained Language Models

Zhefan Wang, Weizhi Ma, Min Zhang

TL;DR

This work is the first to study recommendability before recommendation and provides preliminary ways to make it a fundamental component of future recommender systems.

Abstract

Most current recommender systems primarily focus on what to recommend, assuming users always require personalized recommendations. However, with the widespread adoption of ChatGPT and other chatbots, a more crucial problem in conversational systems is how to minimize user disruption when providing recommendation services. While previous research has extensively explored different user intents in dialogue systems, fewer efforts have been made to investigate whether recommendations should be provided at all. In this paper, we formally define the recommendability identification problem, which aims to determine whether recommendations are necessary in a specific scenario. First, we propose and define the recommendability identification task, which investigates the need for recommendations in the current conversational context, and construct a new dataset for it. Subsequently, we discuss and evaluate the feasibility of leveraging pre-trained language models (PLMs) for recommendability identification. Finally, through comparative experiments, we demonstrate that directly employing PLMs in a zero-shot setting falls short of meeting the task requirements. By contrast, fine-tuning or soft prompt techniques yield results comparable to traditional classification methods. Our work is the first to study recommendability before recommendation and provides preliminary ways to make it a fundamental component of future recommender systems.

Paper Structure

This paper contains 27 sections, 4 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Flow comparison of the three methods. Red indicates parameters involved in training, and blue indicates frozen parameters.
  • Figure 2: Some conversation examples from JDDCRec. Bold red indicates that there is a possibility of recommendation. Here, we distinguish between the existence and the strength of the recommendability.
  • Figure 3: Changes in four metrics under different numbers of samples.
  • Figure 4: Performance comparison across different soft prompt lengths.