LEC: Linear Expectation Constraints for False-Discovery Control in Selective Prediction and Routing Systems

Zhiyuan Wang; Aniri; Tianlong Chen; Yue Zhang; Heng Tao Shen; Xiaoshuang Shi; Kaidi Xu

TL;DR

This work tackles the unreliability of LLM outputs by introducing Linear Expectation Constraints (LEC) to enforce false discovery rate (FDR) control in selective prediction. LEC reframes selective prediction as a constrained decision problem and derives a finite-sample sufficient condition, computable from held-out calibration data, that keeps test-time FDR below a user-specified level while maximizing coverage. It extends naturally to two-model routing, which delegates uncertain cases to a stronger model while maintaining a unified FDR guarantee. Empirically, LEC achieves tighter FDR control and higher sample retention than prior confidence-bound methods on closed- and open-ended QA, and routing accepts more correct samples than either individual model.

Abstract

Large language models (LLMs) often generate unreliable answers, while heuristic uncertainty methods fail to fully distinguish correct from incorrect predictions, causing users to accept erroneous answers without statistical guarantees. We address this issue through the lens of false discovery rate (FDR) control, ensuring that among all accepted predictions, the proportion of errors does not exceed a target risk level. To achieve this in a principled way, we propose LEC, which reinterprets selective prediction as a constrained decision problem by enforcing a Linear Expectation Constraint over selection and error indicators. Then, we establish a finite-sample sufficient condition, which relies only on a held-out set of exchangeable calibration samples, to compute an FDR-constrained, coverage-maximizing threshold. Furthermore, we extend LEC to a two-model routing mechanism: given a prompt, if the current model's uncertainty exceeds its calibrated threshold, we delegate it to a stronger model, while maintaining a unified FDR guarantee. Evaluations on closed-ended and open-ended question-answering (QA) datasets show that LEC achieves tighter FDR control and substantially improves sample retention over prior methods. Moreover, the two-model routing mechanism achieves lower risk levels while accepting more correct samples than each individual model.
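To make the calibration step concrete, here is a minimal Python sketch of how a threshold could be chosen from calibration data under a linear constraint on selection and error indicators. It is an illustration under stated assumptions, not the paper's exact procedure: the function names `calibrate_threshold` and `route`, the rule "accept when uncertainty is at most the threshold", and the worst-case slack term `(1 - alpha)` standing in for the unseen test sample are all hypothetical choices made here. The abstract does not state the precise finite-sample condition, so the one below should be read as a plausible stand-in.

```python
import numpy as np


def calibrate_threshold(uncertainty, is_error, alpha, tol=1e-12):
    """Return the largest threshold t such that accepting {u <= t} satisfies
        sum_i 1[u_i <= t] * (err_i - alpha) + (1 - alpha) <= 0
    on the calibration set, or None if no threshold qualifies.

    The (1 - alpha) term is an assumed worst-case allowance for one
    unseen test sample; it is not taken from the paper.
    """
    u = np.asarray(uncertainty, dtype=float)
    e = np.asarray(is_error, dtype=float)

    # Sort calibration samples from most to least confident.
    order = np.argsort(u, kind="stable")
    u_sorted, e_sorted = u[order], e[order]

    # Value of the linear constraint when the k most confident samples
    # are accepted, for k = 1..n.
    constraint = np.cumsum(e_sorted - alpha) + (1.0 - alpha)

    # With tied uncertainties, a threshold accepts the whole tie group,
    # so only the last index of each group is a valid cut point.
    last_of_group = np.append(u_sorted[1:] != u_sorted[:-1], True)

    feasible = (constraint <= tol) & last_of_group
    if not feasible.any():
        return None  # abstain on everything

    # The largest feasible cut point maximizes coverage.
    return float(u_sorted[np.nonzero(feasible)[0][-1]])


def route(u_weak, u_strong, t_weak, t_strong):
    """Two-model routing rule: accept the weaker model's answer if its
    uncertainty is within its threshold, otherwise defer to the stronger
    model, otherwise abstain. How the two thresholds are chosen jointly
    to obtain a unified FDR guarantee is not shown here."""
    if t_weak is not None and u_weak <= t_weak:
        return "weak"
    if t_strong is not None and u_strong <= t_strong:
        return "strong"
    return "abstain"


if __name__ == "__main__":
    # Toy calibration set: uncertainty scores and 0/1 error indicators.
    scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    errors = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
    t = calibrate_threshold(scores, errors, alpha=0.25)
    print("calibrated threshold:", t)
    print("decision:", route(0.9, 0.4, t_weak=t, t_strong=0.5))
```

On the toy data, the cut point keeps the eight most confident samples, of which one is wrong, so the empirical error rate among accepted samples is 0.125, below the target of 0.25. Scanning cumulative sums over sorted scores keeps the search at O(n log n), and returning `None` makes the "accept nothing" case explicit rather than encoding it as a sentinel threshold.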
