Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi

TL;DR

Pass@$k$ policy gradients can conflict with pass@1 gradients because pass@$k$ optimization implicitly reweights prompts toward low-success prompts; when these prompts are what the authors term negatively interfering, upweighting them can rotate the pass@$k$ update direction away from the pass@1 direction.

Abstract

Pass@$k$ is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of $k$ independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@$k$. However, prior work reports a recurring trade-off: pass@$k$ improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@$k$ policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that pass@$k$ policy gradients can conflict with pass@1 gradients because pass@$k$ optimization implicitly reweights prompts toward low-success prompts; when these prompts are what we term negatively interfering, their upweighting can rotate the pass@$k$ update direction away from the pass@1 direction. We illustrate our theoretical findings with large language model experiments on verifiable mathematical reasoning tasks.
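
The implicit reweighting mentioned in the abstract can be made concrete with a minimal sketch. It assumes that, for a fixed prompt, the $k$ samples are independent and each passes the verifier with probability `p`, so the per-prompt pass@$k$ is $1 - (1 - p)^k$. The function names are illustrative and are not taken from the paper.

```python
def pass_at_k(p: float, k: int) -> float:
    """Probability that at least one of k independent samples succeeds,
    assuming each sample succeeds with probability p."""
    return 1.0 - (1.0 - p) ** k


def pass_at_k_weight(p: float, k: int) -> float:
    """Derivative of pass@k with respect to p.

    By the chain rule, this is the factor that multiplies the per-prompt
    pass@1 gradient inside the per-prompt pass@k gradient.
    """
    return k * (1.0 - p) ** (k - 1)


if __name__ == "__main__":
    k = 5
    # Hypothetical success probabilities: a hard, a medium and an easy prompt.
    for p in (0.05, 0.5, 0.95):
        print(f"p={p:.2f}  pass@{k}={pass_at_k(p, k):.4f}  "
              f"weight={pass_at_k_weight(p, k):.6f}")
```

Under these assumptions and with $k = 5$, the factor is roughly $4.07$ for a prompt with $p = 0.05$ and roughly $3.1 \times 10^{-5}$ for a prompt with $p = 0.95$, so low-success prompts dominate the pass@$k$ update. For $k = 1$ the factor is $1$ for every prompt.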

Paper Structure (28 sections, 6 theorems, 58 equations, 8 figures)

Key Result

Proposition 4.1

For any $k \geq 1$ and any $\theta \in \mathbb{R}^d$, In particular, $\langle \nabla J_k(\theta), \nabla J_1(\theta) \rangle < 0$ is equivalent to each one of the following conditions:
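
As a hedged aid to reading the statement above, the following derivation sketches how the inner product $\langle \nabla J_k(\theta), \nabla J_1(\theta) \rangle$ can be expanded. It is a reconstruction from the standard definition of pass@$k$, not the paper's own displayed equation. It assumes that $p_\theta(x)$ denotes the single-sample success probability on prompt $x$, that $J_k$ is the expectation over prompts of the per-prompt pass@$k$, and that gradient and expectation can be interchanged. The weight symbol $w_k$ is introduced here for illustration only.

```latex
\begin{align*}
J_k(\theta) &= \mathbb{E}_{x}\bigl[1 - (1 - p_\theta(x))^k\bigr],
\qquad
J_1(\theta) = \mathbb{E}_{x}\bigl[p_\theta(x)\bigr], \\
\nabla J_k(\theta) &= \mathbb{E}_{x}\bigl[w_k(x)\,\nabla p_\theta(x)\bigr],
\qquad
w_k(x) := k\,(1 - p_\theta(x))^{k-1}, \\
\langle \nabla J_k(\theta), \nabla J_1(\theta) \rangle
&= \mathbb{E}_{x}\bigl[w_k(x)\,\langle \nabla p_\theta(x), \nabla J_1(\theta) \rangle\bigr].
\end{align*}
```

For $k = 1$ the weight is identically $1$ and the inner product reduces to $\|\nabla J_1(\theta)\|^2 \geq 0$. For $k > 1$ it can become negative only if prompts whose gradient $\nabla p_\theta(x)$ has a negative inner product with $\nabla J_1(\theta)$ receive enough weight, which is what happens when those prompts have low success probability.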

Figures (8)

  • Figure 1: (a) Empirical trade-off. Under pass@$k$ policy optimization, pass@$k$ increases while pass@1 may decrease. We explain this empirically observed trade-off in (b) and (c), which schematically illustrate the pass@1 and pass@$k$ ($k>1$) gradients for three prompts and their expectations in policy-parameter space. (b) Pass@1 gradients with a negatively interfering prompt. This panel shows a setting in which prompt $3$ is negatively interfering with prompts $1$ and $2$, i.e., the per-prompt pass@1 gradient for prompt $3$ has negative inner product with the per-prompt pass@1 gradients for prompts $1$ and $2$ (see Section \ref{sec:prompt-interference}). Here, $\nabla$pass@1 denotes the population pass@1 gradient, given by the average (expectation over prompts; here under a uniform distribution) of the per-prompt pass@1 gradients. (c) Pass@$k$ vs. pass@1 gradient conflict. Per-prompt pass@$k$ gradients are scaled versions of the corresponding per-prompt pass@1 gradients (Eq. \ref{eq:pass@k-grad}). This reweighting amplifies the magnitude of the pass@$k$ gradient for the negatively interfering prompt $3$, causing the resulting population pass@$k$ gradient to conflict with the population pass@1 gradient (their inner product becomes negative, corresponding to an obtuse angle, shown here as approximately $120^\circ$). Consequently, a policy update in the pass@$k$ gradient direction can increase pass@$k$ while decreasing pass@1.
  • Figure 2: Cosine kernel heatmap: $\cos(\nabla p_{\theta}(x), \nabla p_{\theta}(x'))$ for subsamples of prompts: 120 easy and 80 hard among a total of 6000 samples. Blue regions correspond to negative prompt interference.
  • Figure 3: Contour plots of pass@1 and pass@$k$ objectives in the policy parameter space. Gradients of pass@$k$ and pass@1 with respect to policy parameters are conflicting in the gray area.
  • Figure 4: Agreement score $a_{\theta}(x)$ (see Eq. \ref{eq:alignment-score}) on the toy example. The prompt distribution contains prompts with both negative and positive agreement scores. Prompts with negative agreement scores are responsible for gradient conflict and hence for the decrease in pass@1. After pass@$k$ policy optimization, negative agreement scores become even more negative.
  • Figure 5: Pass@5 vs. pass@1 in the example of Section \ref{sec:toy-example} under pass@5 policy optimization. 'pop' refers to 'population', i.e., pass@$k$ as defined in Eq. \ref{eq:pass@k}; 'easy' and 'hard' mean that the expectation is taken only over prompts labeled as easy and hard, respectively. Pass@5 increases while pass@1 decreases.
  • ...and 3 more figures
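
The three-prompt picture in the Figure 1 caption can be reproduced numerically with a small sketch. All numbers below (the two-dimensional per-prompt gradients and the success probabilities) are made up for illustration and do not come from the paper. The sketch assumes a uniform prompt distribution and the scaling $k(1-p)^{k-1}$ of per-prompt pass@1 gradients that follows from the definition of pass@$k$ with independent samples.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two nonzero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def population_gradient(grads, probs, k):
    """Uniform average of per-prompt pass@k gradients.

    Each per-prompt pass@1 gradient is scaled by k * (1 - p) ** (k - 1);
    with k = 1 this reduces to the plain average of pass@1 gradients.
    """
    n, dim = len(grads), len(grads[0])
    weights = [k * (1.0 - p) ** (k - 1) for p in probs]
    return tuple(
        sum(w * g[i] for w, g in zip(weights, grads)) / n for i in range(dim)
    )


# Hypothetical per-prompt pass@1 gradients: prompt 3 has a negative inner
# product with prompts 1 and 2, i.e. it is negatively interfering.
grads = [(1.0, 0.3), (1.0, -0.3), (-0.8, 0.5)]
# Hypothetical success probabilities: prompts 1 and 2 are easy, prompt 3 is hard.
probs = [0.9, 0.9, 0.1]

grad_pass1 = population_gradient(grads, probs, k=1)
grad_pass5 = population_gradient(grads, probs, k=5)
conflict = dot(grad_pass5, grad_pass1)

if __name__ == "__main__":
    print("pass@1 gradient:", grad_pass1)
    print("pass@5 gradient:", grad_pass5)
    print("inner product:", conflict)
    print("cosine:", cosine(grad_pass5, grad_pass1))
```

In this toy setting the inner product is negative, so to first order a small step along the pass@5 gradient lowers pass@1 even though it raises pass@5. Setting both success probabilities of the easy prompts closer to that of the hard prompt shrinks the imbalance in the weights and can remove the conflict.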

Theorems & Definitions (13)

  • Definition 3.1: Prompt interference
  • Proposition 4.1: Gradients conflict characterization
  • Remark 4.2
  • Corollary 4.4: Dominating negatively interfering prompts
  • Proposition 4.5: Influence of $k$
  • Proposition 4.6: Pass@$1$ decrease under pass@$k$ updates
  • Proposition B.1: No interference regime: No negative transfer $\implies$ no conflict
  • proof
  • proof
  • proof
  • ...and 3 more