Measuring and Addressing Indexical Bias in Information Retrieval
Caleb Ziems, William Held, Jane Dwivedi-Yu, Diyi Yang
TL;DR
The paper tackles indexical bias in information retrieval by introducing the PAIR framework and the unsupervised DUO bias metric, which assesses how ranking order may skew user perspectives. DUO relies on a polarization axis learned from the Wiki-Balance corpora via PCA and scores the bias of a ranking $r$ as $\text{DUO}(r) = 1 - \text{nDCG}(r, u_V)$ with $u_V(i,r) = \frac{1}{i} \sum_{j=1}^{i} (p_j - \bar{p})^2$, where $p_j$ is the polarization score of the document at rank $j$. This enables automatic bias audits without labeled data. The authors validate DUO on synthetic and natural data, report a strong correlation with supervised bias metrics, and show in a behavioral study that DUO predicts the search engine manipulation effect (SEME) when users click through biased results. They audit a range of IR systems, revealing trade-offs between relevance and indexical bias and highlighting domain-specific weaknesses, particularly in politics and the environment. The work lays a foundation for automatic, scalable bias measurement and for potential reranking strategies that mitigate indexical bias in IR and related rank-ordered information systems.
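To make the formula concrete, here is a minimal Python sketch of how a score of this form could be computed from a list of per-document polarization scores. It is not the authors' implementation. Several details are assumptions that the summary does not specify: $\bar{p}$ is taken as the mean of the first $i$ scores (so $u_V$ is the prefix variance), the DCG discount is the common $\log_2(i+1)$, and the ideal DCG used for normalization is found by brute force over all orderings of the same documents, which is only practical for short lists. The function names are hypothetical.

```python
import math
from itertools import permutations


def prefix_variance_utility(scores):
    """Return u_V(i, r) for i = 1..n.

    Assumption: p-bar is the mean of the first i scores, so each entry
    is the population variance of the top-i polarization scores.
    """
    utilities = []
    for i in range(1, len(scores) + 1):
        prefix = scores[:i]
        mean = sum(prefix) / i
        utilities.append(sum((p - mean) ** 2 for p in prefix) / i)
    return utilities


def dcg(scores):
    """Discounted cumulative gain of the prefix-variance utilities.

    Assumption: the standard log2(i + 1) positional discount.
    """
    utilities = prefix_variance_utility(scores)
    return sum(u / math.log2(i + 1) for i, u in enumerate(utilities, start=1))


def duo_sketch(scores):
    """Return 1 - nDCG for one ranking of polarization scores.

    Assumption: the ideal DCG is the best DCG over every ordering of the
    same scores (brute force, factorial cost, short lists only).
    """
    ideal = max(dcg(list(perm)) for perm in permutations(scores))
    if ideal == 0:
        # All scores identical: no ordering can be more balanced.
        # Returning 0.0 here is an arbitrary convention of this sketch.
        return 0.0
    return 1.0 - dcg(scores) / ideal


# Hypothetical polarization scores: +1 and -1 stand for opposing stances.
interleaved = [1.0, -1.0, 1.0, -1.0]  # stances alternate down the ranking
blocked = [1.0, 1.0, -1.0, -1.0]      # one stance occupies the top ranks

print(duo_sketch(interleaved))
print(duo_sketch(blocked))
```

Under these assumptions, the interleaved ranking receives a lower score than the blocked one, because both stances appear early and the prefix variance is high at the top ranks.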
Abstract
Information Retrieval (IR) systems are designed to deliver relevant content, but traditional systems may not optimize rankings for fairness, neutrality, or the balance of ideas. Consequently, IR can often introduce indexical biases, or biases in the positional order of documents. Although indexical bias can demonstrably affect people's opinions, voting patterns, and other behaviors, these issues remain understudied because the field lacks reliable metrics and procedures for automatically measuring indexical bias. To this end, we introduce the PAIR framework, which supports automatic bias audits for ranked documents or entire IR systems. After introducing DUO, the first general-purpose automatic bias metric, we run an extensive evaluation of 8 IR systems on a new corpus of 32k synthetic and 4.7k natural documents, with 4k queries spanning 1.4k controversial issue topics. A human behavioral study validates our approach, showing that our bias metric can help predict when and how indexical bias will shift a reader's opinion.
