Do Images Clarify? A Study on the Effect of Images on Clarifying Questions in Conversational Search

Clemencia Siro; Zahra Abbasiantaeb; Yifei Yuan; Mohammad Aliannejadi; Maarten de Rijke

TL;DR

This study investigates how images augment clarifying questions in conversational search and examines effects on two core tasks: answering clarifying questions and query reformulation. Using a within-subject design across 30 topics with and without images, the authors measure user experience and retrieval outcomes, finding that images are highly preferred but yield mixed retrieval benefits depending on the task. Images improve reformulation quality and top-level retrieval in reformulation tasks, while for direct answer tasks, text-only clarifications can produce stronger retrieval signals in some setups. The results highlight task- and user-dependent effects, arguing for adaptive, expertise-aware deployment of visual context to optimize multimodal conversational search systems.

Abstract

Conversational search systems increasingly employ clarifying questions to refine user queries and improve the search experience. Previous studies have demonstrated the usefulness of text-based clarifying questions in enhancing both retrieval performance and user experience. While images have been shown to improve retrieval performance in various contexts, their impact on user performance when incorporated into clarifying questions remains largely unexplored. We conduct a user study with 73 participants to investigate the role of images in conversational search, specifically examining their effects on two search-related tasks: (i) answering clarifying questions and (ii) query reformulation. We compare the effect of multimodal and text-only clarifying questions in both tasks within a conversational search context from various perspectives. Our findings reveal that while participants showed a strong preference for multimodal questions when answering clarifying questions, preferences were more balanced in the query reformulation task. The impact of images varied with both task type and user expertise. In answering clarifying questions, images helped maintain engagement across different expertise levels, while in query reformulation they led to more precise queries and improved retrieval performance. Interestingly, for clarifying question answering, text-only setups demonstrated better user performance as they provided more comprehensive textual information in the absence of images. These results provide valuable insights for designing effective multimodal conversational search systems, highlighting that the benefits of visual augmentation are task-dependent and should be strategically implemented based on the specific search context and user characteristics.

Paper Structure (38 sections, 10 figures, 9 tables)

Introduction
Related Work
User intent clarification
Multimodal IR
User aspects
Study Design
Topic selection and pre-study analysis
Study setup
Task structure
Tasks
Questionnaire design
Procedure
Search task distribution
Quality assurance
Data collection
...and 23 more sections

Figures (10)

Figure 1: Two example conversations with text-only clarifying questions, representing the existing systems (on the left), as well as multimodal clarifying questions, representing our proposed system setup (on the right).
Figure 2: Our user study procedure. "Main Study" refers to either query reformulation or answering the clarifying question tasks.
Figure 3: Average time taken by participants to complete each question (a) and the average length of the clarifying question and reformulated queries (b).
Figure 4: Rating distributions of main task aspects as rated by participants in Task 1 (row 1) and Task 2 (row 2).
Figure 5: Information sources relied on by participants when (a) answering clarifying questions and (b) reformulating queries.
...and 5 more figures
