ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang
TL;DR
The paper defines the task of product demand clarification in e-commerce and introduces ProductAgent, a memory-enabled, tool-augmented agent for interactive product search, together with the PROCLARE benchmark for evaluating it. ProductAgent has a three-component architecture (databases, memory, tools) and runs a three-step cycle each turn (Category Analysis, Item Search, Clarification Question Generation), progressively refining user requirements through dynamic statistics and targeted questions. PROCLARE is built on a large document set derived from AliMe KG and uses an LLM-driven user simulator, implemented with LlamaIndex, to evaluate both traditional and conversational retrieval settings automatically. The results show that dense retrievers and BM25 each have distinct strengths depending on the setting, and that reranking helps. The work also analyzes ablations and failure modes, identifying the pivotal role of dynamic statistics and clarification questions. It acknowledges limitations such as simulated users, dataset scope, and prompt sensitivity, which point to future improvements in controllability and evaluation realism.
Abstract
This paper introduces the task of product demand clarification in an e-commerce scenario, where the user begins the conversation with an ambiguous query and a task-oriented agent asks clarification questions to achieve more accurate and tailored product search. To address this task, we propose ProductAgent, a conversational information-seeking agent capable of strategic clarification question generation and dynamic product retrieval. Specifically, we equip the agent with strategies for product feature summarization, query generation, and product retrieval. Furthermore, we propose a benchmark called PROCLARE to evaluate the agent's performance both automatically and qualitatively with the aid of an LLM-driven user simulator. Experiments show that ProductAgent interacts positively with the user and that retrieval performance improves as the number of dialogue turns increases and user demands become more explicit and detailed. All source code will be released after the review anonymity period.
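
To make the turn cycle summarized above more concrete, the following is a minimal Python sketch of a clarification loop in the same spirit: guess a category, retrieve items that match the requirements collected so far, compute statistics over the retrieved items, and ask about an attribute that still splits them. It is an illustration under stated assumptions, not the authors' implementation. The toy catalog, the attribute names, and the functions `analyze_category`, `search_items`, `next_question`, and `run_dialogue` are all hypothetical, and simple keyword matching and counting stand in for the LLM calls, databases, and retrievers used in the paper.

```python
from collections import Counter

# Toy catalog; every field and value here is made up for illustration.
CATALOG = [
    {"id": 1, "category": "shoes", "color": "red", "size": "42"},
    {"id": 2, "category": "shoes", "color": "red", "size": "43"},
    {"id": 3, "category": "shoes", "color": "black", "size": "42"},
    {"id": 4, "category": "shirt", "color": "red", "size": "M"},
]
ATTRIBUTES = ("color", "size")


def analyze_category(query):
    """Step 1, Category Analysis: keyword match standing in for an LLM call."""
    for category in sorted({item["category"] for item in CATALOG}):
        if category in query.lower():
            return category
    return None


def search_items(memory):
    """Step 2, Item Search: keep items consistent with every known requirement."""
    return [
        item for item in CATALOG
        if all(item.get(key) == value for key, value in memory.items())
    ]


def next_question(items, memory):
    """Step 3, Clarification Question Generation from dynamic statistics.

    Counts attribute values among the retrieved items and picks the
    unanswered attribute with the most distinct values (first one on ties).
    """
    best = None
    for attr in ATTRIBUTES:
        if attr in memory:
            continue
        counts = Counter(item[attr] for item in items)
        if len(counts) > 1 and (best is None or len(counts) > len(best[1])):
            best = (attr, counts)
    if best is None:
        return None  # nothing left that would narrow the results
    attr, counts = best
    return attr, sorted(counts)


def run_dialogue(query, answer_fn, max_turns=3):
    """Run the cycle; answer_fn plays the role of a (simulated) user."""
    memory = {}  # requirements gathered so far
    category = analyze_category(query)
    if category is not None:
        memory["category"] = category
    items = search_items(memory)
    for _ in range(max_turns):
        question = next_question(items, memory)
        if question is None:
            break
        attr, options = question
        memory[attr] = answer_fn(attr, options)
        items = search_items(memory)
    return items, memory
```

As a usage example, a simulated user who wants red shoes in size 42 can be expressed as `run_dialogue("I want some shoes", lambda attr, options: {"color": "red", "size": "42"}[attr])`. In this toy setup the candidate set shrinks after each answer, which mirrors the abstract's observation that retrieval improves as user demands become more explicit.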
