Generating Multi-Aspect Queries for Conversational Search

Zahra Abbasiantaeb, Simon Lupart, Mohammad Aliannejadi

TL;DR

This work hypothesizes that breaking down the information of an utterance into multi-aspect rewritten queries can lead to more effective retrieval performance, and proposes a multi-aspect query generation and retrieval framework, called MQ4CS, which outperforms the state-of-the-art query rewriting methods.

Abstract

Conversational information seeking (CIS) systems aim to model the user's information need within the conversational context and retrieve the relevant information. One major approach to modeling the conversational context is to rewrite the user utterance so that it represents the information need independently. Recent work has shown the benefit of expanding the rewritten utterance with relevant terms. In this work, we hypothesize that breaking down the information of an utterance into multi-aspect rewritten queries can lead to more effective retrieval performance. This is more evident in complex utterances that require gathering evidence from various information sources, where a single query rewrite or query representation cannot capture the complexity of the utterance. To test this hypothesis, we conduct extensive experiments on five widely used CIS datasets, where we leverage LLMs to generate multi-aspect queries that represent the information need of each utterance in multiple query rewrites. We show that, for most of the utterances, the same retrieval model would perform better with more than one rewritten query, by 85% in terms of nDCG@3. We further propose a multi-aspect query generation and retrieval framework, called MQ4CS. Our extensive experiments show that MQ4CS outperforms the state-of-the-art query rewriting methods. We make our code and our new dataset of generated multi-aspect queries publicly available.
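
The idea of searching with several aspect-specific rewrites instead of one can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the token-overlap scorer stands in for a real retrieval model, the function names (`score`, `retrieve`, `interleave`, `multi_query_search`) are hypothetical, and round-robin interleaving is only one plausible way to merge the per-query rankings. The multi-aspect queries are assumed to have already been generated by an LLM.

```python
from collections import Counter


def score(query: str, passage: str) -> int:
    """Toy relevance score (assumption): number of shared lowercase tokens."""
    q = Counter(query.lower().split())
    p = Counter(passage.lower().split())
    return sum((q & p).values())


def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Rank passage ids for a single query; ties are broken by passage id."""
    ranked = sorted(corpus, key=lambda pid: (-score(query, corpus[pid]), pid))
    return [pid for pid in ranked if score(query, corpus[pid]) > 0][:k]


def interleave(ranked_lists: list[list[str]]) -> list[str]:
    """Round-robin merge of several rankings, skipping duplicates.

    This merging strategy is an illustrative assumption, not the paper's
    confirmed fusion method.
    """
    merged, seen = [], set()
    depth = max((len(r) for r in ranked_lists), default=0)
    for rank in range(depth):
        for ranking in ranked_lists:
            if rank < len(ranking) and ranking[rank] not in seen:
                seen.add(ranking[rank])
                merged.append(ranking[rank])
    return merged


def multi_query_search(queries: list[str], corpus: dict[str, str], k: int = 2) -> list[str]:
    """Retrieve independently for each aspect query, then merge the rankings."""
    return interleave([retrieve(q, corpus, k) for q in queries])
```

Because each aspect query is searched on its own, a passage that matches only one aspect can still reach the top of the merged list, whereas a single combined rewrite would have to trade the aspects off against each other.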

Paper Structure (17 sections, 4 equations, 7 figures, 16 tables)

Figures (7)

  • Figure 1: An example conversation with a complex user utterance. The system needs to generate three distinct queries and search for every query.
  • Figure 2: A high-level overview of the proposed framework, compared with existing models. In QR, a single query is generated by an LLM, and in LLM4CS, multiple LLM calls are made to generate different query rewrites. In our MQ4CS and MQ4CS$_\text{ans}$ models, we generate multi-aspect queries in a single prompt. We then perform retrieval on each query independently to avoid information loss.
  • Figure 3: Distribution of the turns over their corresponding $\phi^*$ values. The value of $\phi^*$ is selected based on the nDCG@3 metric.
  • Figure 4: A comparison between the retrieval performance of the proposed MQ4CS (with $\phi=5$) and the baselines over complex and easy turns.
  • Figure 5: Performance of MQ4CS compared with the baselines on TopiOCQA, broken into the turns with and without topic shift.
  • ...and 2 more figures