BoolQuestions: Does Dense Retrieval Understand Boolean Logic in Language?

Zongmeng Zhang, Jinhua Zhu, Wengang Zhou, Xiang Qi, Peng Zhang, Houqiang Li

TL;DR

Current dense retrieval systems do not fully understand Boolean logic in language, and substantial work remains to improve them.

Abstract

Dense retrieval, which aims to encode the semantic information of arbitrary text into dense vector representations or embeddings, has emerged as an effective and efficient paradigm for text retrieval, consequently becoming an essential component in various natural language processing systems. These systems typically focus on optimizing the embedding space by attending to the relevance of text pairs, while overlooking the Boolean logic inherent in language, which may not be captured by current training objectives. In this work, we first investigate whether current retrieval systems can comprehend the Boolean logic implied in language. To answer this question, we formulate the task of Boolean Dense Retrieval and collect a benchmark dataset, BoolQuestions, which covers complex queries containing basic Boolean logic and corresponding annotated passages. Through extensive experimental results on the proposed task and benchmark dataset, we draw the conclusion that current dense retrieval systems do not fully understand Boolean logic in language, and there is a long way to go to improve our dense retrieval systems. Furthermore, to promote further research on enhancing the understanding of Boolean logic for language models, we explore Boolean operations on decomposed queries and propose a contrastive continual training method that serves as a strong baseline for the research community.
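
To make the phrase "Boolean operations on decomposed queries" concrete, here is a minimal sketch of one way such a scheme could work: each sub-query is retrieved separately and the ranked lists are combined with set operations. This is an illustrative assumption, not the authors' implementation; `boolean_retrieve`, `toy_retrieve`, and `TOY_INDEX` are hypothetical names, and the toy lookup table stands in for a real dense retriever.

```python
from typing import Callable, Dict, List

# Hypothetical retriever signature: (query, k) -> ranked list of unique passage ids.
Retriever = Callable[[str, int], List[str]]


def boolean_retrieve(op: str, left: str, right: str,
                     retrieve: Retriever, k: int = 10) -> List[str]:
    """Combine the results of two sub-queries with a Boolean operator.

    Assumption: the original question has already been decomposed into
    `left` and `right` sub-queries joined by `op` (AND, OR, or NOT).
    """
    left_hits = retrieve(left, k)
    right_hits = retrieve(right, k)
    right_set = set(right_hits)

    if op == "AND":
        # Keep passages returned for both sub-queries, in left-hand rank order.
        return [p for p in left_hits if p in right_set]
    if op == "OR":
        # Union of both lists: left-hand results first, then unseen right-hand ones.
        seen = set(left_hits)
        merged = list(left_hits)
        for p in right_hits:
            if p not in seen:
                merged.append(p)
                seen.add(p)
        return merged
    if op == "NOT":
        # Passages matching the left sub-query but not the right one.
        return [p for p in left_hits if p not in right_set]
    raise ValueError(f"unsupported operator: {op}")


# Toy stand-in for a dense retriever: a fixed lookup table (hypothetical data).
TOY_INDEX: Dict[str, List[str]] = {
    "rivers in france": ["p1", "p2", "p3"],
    "rivers in germany": ["p2", "p3", "p4"],
}


def toy_retrieve(query: str, k: int) -> List[str]:
    return TOY_INDEX.get(query, [])[:k]


and_hits = boolean_retrieve("AND", "rivers in france", "rivers in germany", toy_retrieve)
or_hits = boolean_retrieve("OR", "rivers in france", "rivers in germany", toy_retrieve)
not_hits = boolean_retrieve("NOT", "rivers in france", "rivers in germany", toy_retrieve)
```

In this sketch the combination step is purely set-based and ignores retrieval scores; a real system would likely need a score-aware merge and a larger `k` per sub-query so that intersections are not empty.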

Paper Structure

This paper contains 36 sections, 5 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Examples of AND, OR, and NOT questions in BoolQuestions, alongside questions from MS MARCO and Natural Questions. Questions in BoolQuestions are more complex to understand than those in MS MARCO and Natural Questions.
  • Figure 2: Data collection pipeline of BoolQuestions.
  • Figure 3: Question types covered in BoolQuestions. Questions are first grouped by the Boolean logic they imply and then heuristically categorized following the method of Yang et al. (2018) (HotpotQA). Colored blocks without labels indicate questions whose types cannot be determined.