ExPairT-LLM: Exact Learning for LLM Code Selection by Pairwise Queries

Tom Yuviler, Dana Drachsler-Cohen

TL;DR

ExPairT-LLM reframes code selection as an exact learning problem, introducing pairwise membership and pairwise equivalence queries to an LLM oracle and solving the task via a Copeland-style tournament over candidate program clusters. The method iteratively refines the candidate set by distinguishing outputs on targeted inputs and verifying differentiating inputs through execution, achieving robustness to some LLM mistakes. The authors provide termination guarantees and probabilistic bounds under imperfect oracle behavior, and demonstrate consistent pass@1 improvements over state-of-the-art baselines across multiple benchmarks, including gains for LLMs with complex reasoning. The approach also highlights the practical feasibility of query-bounded operation and offers concrete LLM prompts for both query types, contributing a principled, interactive framework for robust code generation selection.

Abstract

Despite recent advances in LLMs, the task of code generation is still challenging. To cope, code selection algorithms select the best program from multiple programs generated by an LLM. However, existing algorithms can fail to identify the correct program, either because they can misidentify nonequivalent programs or because they rely on an LLM and assume it always correctly determines the output for every input. We present ExPairT-LLM, an exact learning algorithm for code selection that selects a program by posing to an LLM oracle two new types of queries: pairwise membership and pairwise equivalence. These queries are simpler for LLMs and enable ExPairT-LLM to identify the correct program through a tournament, which is robust to some LLM mistakes. We evaluate ExPairT-LLM on four popular code datasets. Its pass@1 (success rate) outperforms the state-of-the-art code selection algorithm on average by +13.0% and up to +27.1%. It also improves the pass@1 of LLMs performing complex reasoning by +24.0%.
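The tournament idea described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a fixed pool of test inputs (the paper's method finds differentiating inputs iteratively), it omits pairwise equivalence queries, and it replaces the LLM with a hypothetical `oracle` callable that answers a pairwise membership query by returning `0` or `1`. The names `cluster_by_behavior` and `copeland_select` are illustrative only.

```python
from itertools import combinations
from typing import Any, Callable, Dict, List, Sequence, Tuple

Program = Callable[[Any], Any]
# Hypothetical stand-in for the LLM: given an input and two candidate
# outputs, return 0 if the first output is correct, 1 if the second is.
Oracle = Callable[[Any, Any, Any], int]


def cluster_by_behavior(
    programs: Sequence[Program], inputs: Sequence[Any]
) -> Dict[Tuple[Any, ...], List[Program]]:
    """Group programs that produce identical outputs on every input.

    Outputs must be hashable because the output tuple is used as a key.
    """
    clusters: Dict[Tuple[Any, ...], List[Program]] = {}
    for prog in programs:
        signature = tuple(prog(x) for x in inputs)
        clusters.setdefault(signature, []).append(prog)
    return clusters


def copeland_select(
    programs: Sequence[Program], inputs: Sequence[Any], oracle: Oracle
) -> Program:
    """Pick a program from the cluster that wins the most pairwise duels."""
    clusters = cluster_by_behavior(programs, inputs)
    signatures = list(clusters)
    wins = {sig: 0 for sig in signatures}

    for a, b in combinations(signatures, 2):
        # Distinct signatures differ on at least one input; use the first
        # such input as the differentiating input, found by execution.
        idx = next(i for i in range(len(inputs)) if a[i] != b[i])
        choice = oracle(inputs[idx], a[idx], b[idx])
        wins[a if choice == 0 else b] += 1

    # Copeland winner: the cluster with the most pairwise wins.
    best = max(signatures, key=lambda sig: wins[sig])
    return clusters[best][0]


# Usage: four candidates for "square the input", two of them equivalent.
candidates = [lambda x: x * x, lambda x: x ** 2, lambda x: 2 * x, lambda x: x + 2]
perfect_oracle = lambda x, out_a, out_b: 0 if out_a == x * x else 1
selected = copeland_select(candidates, [0, 1, 2, 3], perfect_oracle)
```

Because the winner is decided by the total number of duels won rather than by a single elimination, one wrong oracle answer does not necessarily change the outcome, which is the intuition behind the robustness to some LLM mistakes mentioned in the abstract.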

Paper Structure

This paper contains 28 sections, 14 theorems, 10 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

Lemma 0

Algorithm 1 (the selection algorithm) terminates within $|\mathcal{P}|$ iterations.

Figures (8)

  • Figure 1: ExPairT-LLM: A code selection algorithm by pairwise queries.
  • Figure 2: Tasks from (a) MBPP-sanitized (Austin et al., 2021) and (b) APPS (Hendrycks et al., 2021).
  • Figure 3: A running example of ExPairT-LLM.
  • Figure 4: The empirical probability (dashed) and our lower bound (solid) for different $p$.
  • Figure 5: The probability of finding a differentiating input as a function of the number of incorrect programs $d$.
  • ...and 3 more figures

Theorems & Definitions (21)

  • Lemma 0
  • Lemma 0
  • Lemma 0
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 4
  • proof
  • Lemma 4
  • ...and 11 more