Query Augmentation by Decoding Semantics from Brain Signals

Ziyi Ye, Jingtao Zhan, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Christina Lioma, Tuukka Ruotsalo

TL;DR

Brain-Aug addresses the challenge of semantically imprecise queries by decoding query-relevant semantics from fMRI brain signals and using them to generate a continuation of the user query via a prompt-tuned language model. The method aligns brain embeddings with text embeddings, constructs a unified prompt, and employs a next-token objective along with ranking-aware inference that incorporates IDF to improve document retrieval. Experiments on Pereira's, Huth's, and Narratives fMRI datasets show that brain-signal-informed query augmentation yields semantically closer query continuations and higher ranking performance, with the largest gains for ambiguous queries. The work demonstrates the feasibility of brain-signal-informed IR and highlights practical limitations and ethical considerations for real-world deployment.

Abstract

Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, the effectiveness of query augmentation is limited as well. We propose Brain-Aug, which enhances a query by incorporating semantic information decoded from brain signals. Brain-Aug generates the continuation of the original query using a prompt constructed from brain signal information and a ranking-oriented inference approach. Experimental results on fMRI (functional magnetic resonance imaging) datasets show that Brain-Aug produces semantically more accurate queries, leading to improved document ranking performance. The improvement brought by brain signals is particularly notable for ambiguous queries.
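
The summary above mentions a ranking-aware inference step that incorporates IDF, but it does not give the formula. As a rough illustration only, the sketch below assumes one plausible reading: when choosing the next token of the query continuation, each candidate's log-probability receives a bonus proportional to its inverse document frequency, so that rare, discriminative terms are favored. The function names `idf_table` and `pick_next_token`, the smoothing used for IDF, and the weight `alpha` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def idf_table(documents):
    """Smoothed inverse document frequency for each whitespace-split term.

    Assumption: idf(t) = ln((N + 1) / (df(t) + 1)) + 1, a common smoothed
    variant; the paper may use a different definition.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term at most once per document.
        doc_freq.update(set(doc.lower().split()))
    return {
        term: math.log((n_docs + 1) / (df + 1)) + 1.0
        for term, df in doc_freq.items()
    }


def pick_next_token(candidate_log_probs, idf, alpha=0.5, default_idf=0.0):
    """Return the candidate with the best log-probability plus IDF bonus.

    candidate_log_probs maps each candidate token to its log-probability
    under the language model. alpha is a hypothetical trade-off weight.
    """
    return max(
        candidate_log_probs,
        key=lambda tok: candidate_log_probs[tok] + alpha * idf.get(tok, default_idf),
    )


corpus = ["the cat sat", "the dog ran", "the bird flew"]
idf = idf_table(corpus)
candidates = {"the": -0.5, "cat": -0.7}
print(pick_next_token(candidates, idf, alpha=0.5))
print(pick_next_token(candidates, idf, alpha=0.0))
```

With `alpha=0.0` the choice reduces to ordinary greedy decoding, which makes the effect of the IDF term easy to isolate in this toy setting.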

Paper Structure

This paper contains 31 sections, 10 equations, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: The procedure of query augmentation by decoding semantics from brain signals (Brain-Aug).
  • Figure 2: Relationship between document ranking performance and perplexity of ground-truth query continuation in Pereira's dataset. "RS B" indicates the ablation of Brain-Aug that randomizes brain inputs. $\Delta$ NDCG@20 indicates performance gains of Brain-Aug.
  • Figure 3: Document ranking performance w.r.t. different query features in Pereira's dataset.