Pareto Optimal Learning for Estimating Large Language Model Errors

Theodore Zhao; Mu Wei; J. Samuel Preston; Hoifung Poon

TL;DR

The paper tackles the challenge of quantifying and reducing errors in large language models by proposing Pareto Optimal Learning (POLAR), which jointly learns a probabilistic harmonizer $h$ that aligns the LLM output with multiple external information sources. The POLAR score, $\zeta(x, \Lambda(x); h^*) = 1 - h^*(x)[\Lambda(x)]$, provides a calibrated estimate of the true error probability and enables two dynamic prompting strategies: self-verification and POLAR-assisted RAG. Empirical results across biomedical and general NLP tasks show POLAR achieves well-calibrated error estimates (low ECE, high $R^2$) and, when combined with dynamic prompting, can surpass state-of-the-art task-specific models. The framework highlights the practical impact of integrating heterogeneous knowledge sources with LLMs to both quantify and reduce errors in real-world applications, including disease prediction and QA.
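As a minimal sketch of the POLAR score formula above, the snippet below reads off the harmonizer's probability for the class the LLM answered and subtracts it from one. It assumes a classification setting in which the fitted harmonizer returns one probability vector per input; the function name `polar_score` and the toy arrays are hypothetical and not taken from the paper.

```python
import numpy as np


def polar_score(harmonizer_probs: np.ndarray, llm_labels: np.ndarray) -> np.ndarray:
    """Compute 1 - h(x)[Lambda(x)] for each example.

    harmonizer_probs: shape (n_examples, n_classes), rows are class
        probabilities from a fitted harmonizer (assumed given here).
    llm_labels: shape (n_examples,), integer class chosen by the LLM.
    """
    rows = np.arange(len(llm_labels))
    # Pick, for each row, the probability assigned to the LLM's answer.
    return 1.0 - harmonizer_probs[rows, llm_labels]


# Hypothetical harmonizer outputs over 3 classes for 2 inputs.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 1])  # classes the LLM answered
scores = polar_score(probs, labels)
```

A low score means the harmonizer agrees with the LLM's answer; a high score flags a response that is more likely to be wrong.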

Abstract

Large Language Models (LLMs) have shown impressive abilities in many applications. When a concrete and precise answer is desired, it is important to have a quantitative estimation of the potential error rate. However, this can be challenging due to the text-in-text-out nature of generative models. We present a method based on Pareto optimization that generates a risk score to estimate the probability of error in an LLM response by integrating multiple sources of information. We prove theoretically that the error estimator optimized in our framework aligns with the LLM and the information sources in a Pareto optimal manner. Experimental results show that the risk scores estimated by our method are well correlated with the true LLM error rate, thus facilitating error correction. By dynamically combining with prompting strategies such as self-verification and information retrieval, we demonstrate that the proposed method can be used to increase the performance of an LLM, surpassing state-of-the-art task-specific models.
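The idea of dynamically combining a risk score with self-verification can be sketched as a simple threshold rule: re-prompt only when the estimated risk is high. Everything below is an illustrative assumption rather than the authors' implementation: the function names, the follow-up prompt wording, the default threshold of 0.5, and the stub LLM and risk functions used to exercise the control flow.

```python
from typing import Callable


def answer_with_dynamic_verification(
    question: str,
    ask_llm: Callable[[str], str],
    risk_of: Callable[[str, str], float],
    threshold: float = 0.5,
) -> str:
    """Ask once, then re-prompt only if the risk score exceeds the threshold."""
    answer = ask_llm(question)
    if risk_of(question, answer) > threshold:
        # Hypothetical self-verification prompt; the wording is an assumption.
        follow_up = (
            f"{question}\n"
            f"Your previous answer was: {answer}\n"
            "Please double-check it and give a final answer."
        )
        answer = ask_llm(follow_up)
    return answer


# Stub components so the sketch runs without any external service.
def fake_llm(prompt: str) -> str:
    return "B" if "double-check" in prompt else "A"


def fake_risk(question: str, answer: str) -> float:
    return 0.9 if answer == "A" else 0.1


result = answer_with_dynamic_verification("Which option?", fake_llm, fake_risk)
```

Gating the second call on the risk score keeps low-risk answers untouched and spends extra LLM calls only where an error is more likely.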

Paper Structure (41 sections, 1 theorem, 23 equations, 3 figures, 3 tables, 2 algorithms)

Introduction
Related Work
Methodology
Problem setup
Pareto Optimal Learning Assessed Risk
Step 1: Pareto Optimal Learning
Step 2: POLAR Score Estimation
Step 3*: Error Correction with POLAR
Dynamic self-verification
POLAR-assisted RAG
Experiments
Dataset
Prompt design
Information sources
Optimization
...and 26 more sections

Key Result

Theorem 1

Suppose $G$ is a Pareto aggregator as in Definition 2. Then solving the problem in Equation eq-pareto approximates a Pareto optimum by minimizing an upper bound.

Figures (3)

Figure 1: Pareto optimal learning framework for LLM error estimation and correction.
Figure 2: LLM error estimation using the POLAR score. (a) The LLM response error rate vs. ten equal-interval POLAR score bins. (b) The POLAR scores are sorted and then binned so that each bin contains 100 examples; the average LLM error and the average POLAR score are plotted for each bin. The last bin, with the top POLAR scores, may have fewer than 100 examples. (c) The average LLM error rate vs. top-percentile POLAR score examples.
Figure 3: (a) The GPT-4 error rate before and after re-prompting, plotted against the POLAR score. (b) The performance improvement from the two dynamic prompting strategies (dynamic self-verification and POLAR-assisted RAG).
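The equal-interval binning described for Figure 2(a) can be sketched as follows: scores in [0, 1] are split into ten bins, and each bin's observed error rate is compared with its mean score. The expected calibration error (ECE) is computed here with the common count-weighted absolute-gap definition, which is an assumption about the exact variant used in the paper; the function name and toy data are hypothetical.

```python
import numpy as np


def calibration_table(scores, errors, n_bins=10):
    """Bin risk scores into equal-width intervals and compare with error rates.

    scores: risk scores in [0, 1].
    errors: 1 if the LLM response was wrong, 0 if it was correct.
    Returns (rows, ece), where each row is
    (bin_index, count, mean_score, error_rate) for a non-empty bin.
    """
    scores = np.asarray(scores, dtype=float)
    errors = np.asarray(errors, dtype=float)
    # Equal-width bins; a score of exactly 1.0 is placed in the last bin.
    bin_idx = np.minimum((scores * n_bins).astype(int), n_bins - 1)

    rows = []
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue  # skip empty bins
        mean_score = scores[mask].mean()
        error_rate = errors[mask].mean()
        rows.append((b, int(mask.sum()), mean_score, error_rate))
        # Weight each bin's calibration gap by its share of the examples.
        ece += mask.mean() * abs(error_rate - mean_score)
    return rows, ece


# Toy data: two confident-correct and two flagged-wrong responses.
toy_scores = [0.05, 0.15, 0.85, 0.95]
toy_errors = [0, 0, 1, 1]
rows, ece = calibration_table(toy_scores, toy_errors)
```

A well-calibrated score gives per-bin error rates close to the per-bin mean scores, and therefore a small ECE.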

Theorems & Definitions (4)

Definition 1: Pareto optimal
Definition 2: Pareto aggregator
Theorem 1
proof
