Pareto Optimal Learning for Estimating Large Language Model Errors
Theodore Zhao, Mu Wei, J. Samuel Preston, Hoifung Poon
TL;DR
The paper tackles the challenge of quantifying and reducing errors in large language models by proposing Pareto Optimal Learning (POLAR), which jointly learns a probabilistic harmonizer $h$ that aligns the LLM output with multiple external information sources. The POLAR score, $\zeta(x, \Lambda(x); h^*) = 1 - h^*(x)[\Lambda(x)]$, provides a calibrated estimate of the true error probability and enables two dynamic prompting strategies: self-verification and POLAR-assisted RAG. Empirical results across biomedical and general NLP tasks show POLAR achieves well-calibrated error estimates (low ECE, high $R^2$) and, when combined with dynamic prompting, can surpass state-of-the-art task-specific models. The framework highlights the practical impact of integrating heterogeneous knowledge sources with LLMs to both quantify and reduce errors in real-world applications, including disease prediction and QA.
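The score formula above can be illustrated with a minimal Python sketch. It assumes the harmonizer output $h^*(x)$ is available as a list of class probabilities and that the LLM answer $\Lambda(x)$ has been mapped to a class index. The function names `polar_score` and `expected_calibration_error`, and the use of ten equal-width bins, are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def polar_score(harmonizer_probs: Sequence[float], llm_label: int) -> float:
    """Estimated error probability: 1 - h(x)[Lambda(x)].

    harmonizer_probs: class probabilities from a fitted harmonizer (assumed given).
    llm_label: index of the class the LLM answered (assumed mapping).
    """
    return 1.0 - harmonizer_probs[llm_label]


def expected_calibration_error(
    scores: Sequence[float], errors: Sequence[int], n_bins: int = 10
) -> float:
    """Standard binned ECE between predicted error probabilities in [0, 1]
    and observed 0/1 error indicators (1 means the LLM answer was wrong)."""
    bins = [[] for _ in range(n_bins)]
    for s, e in zip(scores, errors):
        # Clamp so that a score of exactly 1.0 falls into the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        bins[idx].append((s, e))

    total = len(scores)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_score = sum(s for s, _ in b) / len(b)
        error_rate = sum(e for _, e in b) / len(b)
        # Weight each bin's calibration gap by its share of the examples.
        ece += (len(b) / total) * abs(mean_score - error_rate)
    return ece
```

For example, if the harmonizer assigns probability 0.7 to the class the LLM chose, `polar_score([0.1, 0.7, 0.2], 1)` gives an estimated error probability of about 0.3. A low ECE means the scores track the observed error rate closely within each bin.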
Abstract
Large Language Models (LLMs) have shown impressive abilities in many applications. When a concrete and precise answer is desired, it is important to have a quantitative estimation of the potential error rate. However, this can be challenging due to the text-in-text-out nature of generative models. We present a method based on Pareto optimization that generates a risk score to estimate the probability of error in an LLM response by integrating multiple sources of information. We prove theoretically that the error estimator optimized in our framework aligns with the LLM and the information sources in a Pareto optimal manner. Experimental results show that the risk scores estimated by our method are well correlated with the true LLM error rate, thus facilitating error correction. By dynamically combining the risk score with prompting strategies such as self-verification and information retrieval, we demonstrate that the proposed method can be used to increase the performance of an LLM, surpassing state-of-the-art task-specific models.
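One way to read "dynamically combining" the risk score with self-verification is to re-prompt only when the estimated risk is high. The sketch below is a hypothetical illustration of that idea, not the authors' implementation: `ask_llm` and `risk_of` are placeholder callables supplied by the caller, and the threshold of 0.5 and the follow-up wording are assumptions.

```python
from typing import Callable


def answer_with_dynamic_verification(
    question: str,
    ask_llm: Callable[[str], str],          # hypothetical: sends a prompt, returns an answer
    risk_of: Callable[[str, str], float],   # hypothetical: risk score for (question, answer)
    threshold: float = 0.5,                 # assumed cutoff, not taken from the paper
) -> str:
    """Return the first answer if its risk is low; otherwise ask the LLM once more."""
    answer = ask_llm(question)
    if risk_of(question, answer) <= threshold:
        return answer

    # High estimated risk: ask the model to re-examine its previous answer.
    follow_up = (
        f"{question}\n"
        f"Your previous answer was: {answer}\n"
        "That answer may be wrong. Please re-examine it and answer again."
    )
    return ask_llm(follow_up)
```

Because the second call is made only for high-risk responses, the extra prompting cost is limited to the cases where the risk score suggests an error is likely.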
