ExPairT-LLM: Exact Learning for LLM Code Selection by Pairwise Queries
Tom Yuviler, Dana Drachsler-Cohen
TL;DR
ExPairT-LLM reframes code selection as an exact learning problem. It poses two kinds of queries to an LLM oracle, pairwise membership and pairwise equivalence, and selects a program via a Copeland-style tournament over clusters of candidate programs. The method iteratively refines the candidate set by distinguishing candidates' outputs on targeted inputs and by verifying differentiating inputs through execution, which makes it robust to some LLM mistakes. The authors provide termination guarantees and probabilistic bounds under imperfect oracle behavior, and report consistent pass@1 improvements over state-of-the-art baselines across four code datasets, including gains for LLMs that perform complex reasoning. They also show that the approach remains practical when the number of queries is bounded, and they give concrete LLM prompts for both query types.
Abstract
Despite recent advances in LLMs, code generation remains challenging. To cope, code selection algorithms select the best program from multiple programs generated by an LLM. However, existing algorithms can fail to identify the correct program, either because they can misidentify nonequivalent programs or because they rely on an LLM and assume it always correctly determines the output for every input. We present ExPairT-LLM, an exact learning algorithm for code selection that selects a program by posing two new types of queries to an LLM oracle: pairwise membership and pairwise equivalence. These queries are simpler for LLMs and enable ExPairT-LLM to identify the correct program through a tournament, which is robust to some LLM mistakes. We evaluate ExPairT-LLM on four popular code datasets. Its pass@1 (success rate) exceeds that of the state-of-the-art code selection algorithm by +13.0% on average and by up to +27.1%. It also improves the pass@1 of LLMs performing complex reasoning by +24.0%.
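To make the tournament idea concrete, here is a minimal Python sketch of a Copeland-style selection driven by pairwise queries. It is an illustration under stated assumptions, not the authors' implementation: the names `find_differentiating_input`, `copeland_scores`, `select_program`, and `reference_oracle` are hypothetical, the oracle is a deterministic stand-in for an LLM and never errs, candidates are compared individually rather than as clusters, and the pairwise equivalence query is replaced by executing both programs on a small fixed input pool.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence

Program = Callable[[int], int]
# Oracle(x, out_a, out_b) -> True if out_a is judged the correct output for x,
# False if out_b is preferred. Hypothetical interface, assumed for this sketch.
Oracle = Callable[[int, int, int], bool]


def find_differentiating_input(
    p: Program, q: Program, inputs: Sequence[int]
) -> Optional[int]:
    """Execute both programs and return the first input on which they differ."""
    for x in inputs:
        if p(x) != q(x):
            return x
    return None


def copeland_scores(
    programs: Sequence[Program], inputs: Sequence[int], oracle: Oracle
) -> List[int]:
    """Count pairwise wins: one match per pair of distinguishable programs."""
    wins = [0] * len(programs)
    for i, j in combinations(range(len(programs)), 2):
        x = find_differentiating_input(programs[i], programs[j], inputs)
        if x is None:
            continue  # indistinguishable on this pool: no match is played
        first_is_correct = oracle(x, programs[i](x), programs[j](x))
        wins[i if first_is_correct else j] += 1
    return wins


def select_program(
    programs: Sequence[Program], inputs: Sequence[int], oracle: Oracle
) -> int:
    """Return the index of the program with the most pairwise wins."""
    if not programs:
        raise ValueError("need at least one candidate program")
    wins = copeland_scores(programs, inputs, oracle)
    return max(range(len(wins)), key=wins.__getitem__)


def reference_oracle(x: int, out_a: int, out_b: int) -> bool:
    """Stand-in for the LLM: knows the task is 'absolute value' (assumption)."""
    return out_a == abs(x)


# Four candidate programs for "absolute value"; only index 2 is correct.
PROGRAMS: List[Program] = [
    lambda x: x,                    # wrong on negative inputs
    lambda x: -x,                   # wrong on positive inputs
    lambda x: x if x >= 0 else -x,  # correct
    lambda x: x * x,                # wrong when |x| > 1
]
INPUTS = [0, 1, -1, 2, -2]

best = select_program(PROGRAMS, INPUTS, reference_oracle)
print("selected candidate index:", best)
```

In this toy run the correct candidate wins every match it plays and therefore has the highest score. A real LLM oracle may answer some queries incorrectly; ranking by total pairwise wins means that a single wrong answer does not by itself eliminate a candidate.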
