EqualizeIR: Mitigating Linguistic Biases in Retrieval Models
Jiali Cheng, Hadi Amiri
TL;DR
The paper tackles linguistic biases in neural IR models, which cause performance gaps between queries of different linguistic complexity. It proposes EqualizeIR, a two-stage framework that first trains a linguistically biased weak learner and then regularizes a robust model by fusing the weak learner's biased predictions with the robust model's predictions via $\log(z_D) = \sigma(\alpha \log(z_B) + \log(z_R))$, where $z_B$ and $z_R$ are the predictions of the biased and robust models, $z_D$ is the fused prediction, and $\alpha \in [0,1]$ controls the weight of the biased signal. Key contributions include quantifying linguistic complexity with 45 metrics, introducing four strategies for producing biased weak learners, and demonstrating improved average retrieval performance with reduced bias on BEIR benchmarks (implemented on DPR as a case study). The approach offers a practical path toward fairer and more reliable IR across diverse linguistic styles, with potential applicability to dense retrieval settings and beyond.
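The fusion rule $\log(z_D) = \sigma(\alpha \log(z_B) + \log(z_R))$ can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that $\sigma$ acts as a log-softmax normalization over the candidate documents for one query, so that $z_D$ is a valid probability distribution; the summary does not define $\sigma$, and other readings are possible. The function names `log_softmax` and `fuse_log_probs` and the toy scores are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_norm for s in scores]


def fuse_log_probs(log_z_b, log_z_r, alpha=0.5):
    """Combine biased and robust log-probabilities for one query.

    ASSUMPTION: sigma in the fusion rule is read as a log-softmax over
    the candidate documents, i.e. log z_D = log_softmax(alpha * log z_B + log z_R).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(log_z_b) != len(log_z_r):
        raise ValueError("both models must score the same candidates")
    combined = [alpha * b + r for b, r in zip(log_z_b, log_z_r)]
    return log_softmax(combined)


# Toy relevance scores for three candidate documents (made-up numbers).
biased_scores = [2.0, 0.5, -1.0]   # hypothetical weak (biased) learner
robust_scores = [1.0, 1.2, 0.3]    # hypothetical robust model

log_z_b = log_softmax(biased_scores)
log_z_r = log_softmax(robust_scores)

log_z_d = fuse_log_probs(log_z_b, log_z_r, alpha=0.5)
z_d = [math.exp(v) for v in log_z_d]  # fused distribution over candidates
```

Under this reading, $\alpha = 0$ returns the robust model's distribution unchanged, and larger values of $\alpha$ give the biased learner's prediction more weight in the fused output.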
Abstract
This study finds that existing information retrieval (IR) models show significant biases based on the linguistic complexity of input queries, performing well on linguistically simpler (or more complex) queries while underperforming on linguistically more complex (or simpler) queries. To address this issue, we propose EqualizeIR, a framework to mitigate linguistic biases in IR models. EqualizeIR uses a linguistically biased weak learner to capture linguistic biases in IR datasets and then trains a robust model by regularizing and refining its predictions using the biased weak learner. This approach effectively prevents the robust model from overfitting to specific linguistic patterns in the data. We propose four approaches for developing linguistically biased models. Extensive experiments on several datasets show that our method reduces performance disparities between linguistically simple and complex queries while improving overall retrieval performance.
