Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation

Tianyu Liu; Jirui Qi; Paul He; Arianna Bisazza; Mrinmaya Sachan; Ryan Cotterell

TL;DR

This work investigates how the order of retrieved documents in retrieval-augmented generation affects QA performance. It introduces Pointwise Mutual Information between a question and the context as an answer-agnostic gauge, showing strong corpus- and instance-level correlations with accuracy on NQ-Open and ELI5. Two practical prompt-ordering strategies are proposed: (i) selecting the permutation that maximizes PMI and (ii) a curvature-based method grounded in discrete convexity to induce a U-shaped PMI curve. Empirical results across multiple open LMs demonstrate performance gains, with efficiency advantages from avoiding LM decoding during permutation selection. The study also discusses model-tuning effects and realistic limitations, highlighting PMI-based prompt optimization as a promising direction for improving RAG systems in practice.

Abstract

Recent work suggests that large language models enhanced with retrieval-augmented generation are easily influenced by the order in which the retrieved documents are presented to the model when solving tasks such as question answering (QA). However, there is no method to date that exploits this phenomenon to improve generation. We fill this gap. In this study, we show that the pointwise mutual information between a context and a question is an effective gauge for language model performance. Importantly, this gauge does not depend on knowing the answer to the question a priori. Through experiments on two question-answering datasets and a variety of large language models, we find evidence for an empirical correlation between answer accuracy and pointwise mutual information. Additionally, we propose two methods that use the pointwise mutual information between a document and a question as a gauge for selecting and constructing prompts that lead to better performance, and we demonstrate their effectiveness experimentally.
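To make the gauge concrete, the following is a minimal sketch of the first idea described above: score each ordering of the retrieved documents by the pointwise mutual information between the question and the context, PMI(q; c) = log p(q | c) - log p(q), and keep the ordering with the highest score. It is an illustration under stated assumptions, not the authors' implementation. The names `pmi`, `best_permutation`, and `toy_log_prob` are hypothetical, and `toy_log_prob` is a toy stand-in for a language model's log-probability of the question; in practice it would be replaced by a scorer backed by a real LM. The exhaustive search over permutations is only feasible for a handful of documents.

```python
import math
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: log p(question | context) under some LM.
# An empty context string is taken to mean the unconditional log p(question).
LogProbFn = Callable[[str, str], float]


def pmi(question: str, context: str, log_prob: LogProbFn) -> float:
    """PMI(q; c) = log p(q | c) - log p(q). No answer is needed."""
    return log_prob(question, context) - log_prob(question, "")


def best_permutation(
    question: str,
    docs: Sequence[str],
    log_prob: LogProbFn,
    sep: str = "\n\n",
) -> Tuple[List[str], float]:
    """Return the document ordering with the highest PMI, and that PMI.

    Exhaustive search: n! orderings, so only suitable for small n.
    """
    best_order: Tuple[str, ...] = tuple(docs)
    best_score = -math.inf
    for order in permutations(docs):
        score = pmi(question, sep.join(order), log_prob)
        if score > best_score:
            best_order, best_score = order, score
    return list(best_order), best_score


def toy_log_prob(question: str, context: str) -> float:
    """Toy stand-in for an LM (an assumption for this sketch only).

    Each question word gets a higher probability when it occurs late in
    the context, which makes the score sensitive to document order.
    """
    ctx = context.lower().split()
    total = 0.0
    for word in question.lower().split():
        positions = [i + 1 for i, w in enumerate(ctx) if w == word]
        weight = max(positions) / len(ctx) if positions else 0.0
        total += math.log(0.1 + 0.8 * weight)  # probability in [0.1, 0.9]
    return total


docs = ["paris is the capital of france", "berlin is in germany"]
order, score = best_permutation("capital france", docs, toy_log_prob)
print(order, round(score, 3))
```

Because the score only requires the probability of the question tokens given each candidate context, selecting an ordering this way involves scoring rather than generating text, which is the source of the efficiency advantage mentioned in the TL;DR.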
