The Wisdom of Many Queries: Complexity-Diversity Principle for Dense Retriever Training
Xincan Feng, Noriki Nishida, Yusuke Sakai, Yuji Matsumoto
TL;DR
This work tackles contradictory findings on query diversity in synthetic training data for dense retrieval by introducing quality-diversity (Q-D) metrics and the Complexity-Diversity Principle (CDP), which links query complexity to the value of diversity. It proposes zero-shot multi-query synthesis, which generates M diverse queries per document through prompting with controlled diversity, guided by the Q-D metrics, and demonstrates that diversity yields the largest gains on reasoning-intensive, multi-hop tasks. Across 31 datasets and four benchmark families, the approach achieves state-of-the-art performance on multi-hop retrieval while revealing a robust, data-efficient trade-off: high-complexity tasks benefit from diversity (CW $> 10$), while simple tasks may degrade with excessive diversity (CW $< 7$), where CW is the number of content words in a query. The study provides actionable guidelines for when to apply diversity, highlights diversity as a regularizer, and discusses cost-efficient configurations, with implications for practical dense retriever training and broader generalization.
Abstract
Prior work reports conflicting results on query diversity in synthetic data generation for dense retrieval. We identify this conflict and design Q-D metrics to quantify diversity's impact, making the problem measurable. Through experiments on 4 benchmark types (31 datasets), we find query diversity especially benefits multi-hop retrieval. Deep analysis on multi-hop data reveals that diversity benefit correlates strongly with query complexity ($r \geq 0.95$, $p < 0.05$ in 12/14 conditions), measured by content words (CW). We formalize this as the Complexity-Diversity Principle (CDP): query complexity determines optimal diversity. CDP provides actionable thresholds (CW $> 10$: use diversity; CW $< 7$: avoid it). Guided by CDP, we propose zero-shot multi-query synthesis for multi-hop tasks, achieving state-of-the-art performance.
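
As a rough illustration of how the CDP thresholds stated above might be applied, the following minimal sketch estimates the mean content-word count of a sample of queries and maps it to a recommendation. It is not the authors' implementation. The function names, the small `FUNCTION_WORDS` list, and the stopword-based notion of a content word are all assumptions made for this example; the abstract does not say how content words are identified, and it gives no guidance for CW between 7 and 10, so that range is reported as undetermined.

```python
import re

# Hypothetical function-word list (an assumption for illustration only).
# A part-of-speech tagger would be a more faithful way to find content words.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "for", "by", "with",
    "and", "or", "but", "is", "are", "was", "were", "be", "been", "do",
    "does", "did", "who", "what", "which", "when", "where", "how", "that",
    "this", "it", "its", "as", "from",
}


def content_word_count(query: str) -> int:
    """Count tokens that are not in the (assumed) function-word list."""
    tokens = re.findall(r"[A-Za-z0-9']+", query.lower())
    return sum(1 for token in tokens if token not in FUNCTION_WORDS)


def cdp_recommendation(mean_cw: float) -> str:
    """Apply the thresholds from the abstract: CW > 10 and CW < 7."""
    if mean_cw > 10:
        return "use diversity"
    if mean_cw < 7:
        return "avoid diversity"
    # The abstract does not cover 7 <= CW <= 10.
    return "undetermined"


def recommend_for_queries(queries: list[str]) -> tuple[float, str]:
    """Return the mean content-word count and the resulting recommendation."""
    if not queries:
        raise ValueError("at least one query is required")
    mean_cw = sum(content_word_count(q) for q in queries) / len(queries)
    return mean_cw, cdp_recommendation(mean_cw)
```

Calling `recommend_for_queries` on a sample of a target task's queries gives a quick first signal of whether diverse multi-query synthesis is likely to help, under the assumptions listed above.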
