Regression-aware Inference with LLMs

Michal Lukasik; Harikrishna Narasimhan; Aditya Krishna Menon; Felix Yu; Sanjiv Kumar

TL;DR

This paper builds on prior work on Minimum Bayes Risk decoding and proposes alternative inference strategies that estimate the Bayes-optimal solution for regression and scoring metrics in closed form from sampled responses.

Abstract

Large language models (LLMs) have shown strong results on a range of applications, including regression and scoring tasks. Typically, one obtains outputs from an LLM via autoregressive sampling from the model's output distribution. We show that this inference strategy can be sub-optimal for common regression and scoring evaluation metrics. As a remedy, we build on prior work on Minimum Bayes Risk decoding and propose alternative inference strategies that estimate the Bayes-optimal solution for regression and scoring metrics in closed form from sampled responses. We show that our proposal significantly improves over baselines across datasets and models.
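
To make the phrase "Bayes-optimal solution in closed form" concrete, the following is a sketch of two standard cases: squared error, whose minimizer is the mean, and absolute error, whose minimizer is a median. The notation ($p(y \mid x)$, $\ell$, $v$, $K$, $y_k$) is assumed here for illustration and may differ from the paper's own symbols.

```latex
% Notation is assumed for illustration, not taken from the paper.
% p(y | x): the LLM's distribution over numeric targets y given input x.
% \ell: a loss; y_1, ..., y_K: K sampled responses parsed as numbers.
\begin{align*}
\hat{y}(x) &\in \arg\min_{v \in \mathbb{R}} \;
  \mathbb{E}_{y \sim p(\cdot \mid x)}\big[\ell(y, v)\big] \\
\ell(y, v) = (y - v)^2 &\;\Rightarrow\;
  \hat{y}(x) = \mathbb{E}_{y \sim p(\cdot \mid x)}[y]
  \approx \frac{1}{K} \sum_{k=1}^{K} y_k \\
\ell(y, v) = |y - v| &\;\Rightarrow\;
  \hat{y}(x) = \operatorname{median}_{y \sim p(\cdot \mid x)}[y]
  \approx \operatorname{median}(y_1, \dots, y_K)
\end{align*}
```

In both cases the expectation under the model is replaced by an empirical estimate over the sampled responses, which is what allows the decision rule to be computed without searching over candidate outputs.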

Paper Structure (21 sections, 9 equations, 2 figures, 11 tables)

Introduction
When (naïve) LLM inference fails on regression tasks
Metric-aware LLM inference
Minimum Bayes risk decoding
Closed-form optimal solution
Post-hoc temperature scaling
Extension to multi-partite ranking
Experiments and Discussion
Conclusions
Limitations
Ethics Statement
Acknowledgements
Further related work
Minimum Bayes risk decoding.
Fine-tuning for target task alignment.
...and 6 more sections

Figures (2)

Figure 1: Illustration of metric-aware LLM inference for regression and scoring tasks. An input $x$ is passed to the LLM, and samples are drawn from the distribution over targets $y$ conditioned on $x$. These are then used to find the target optimizing a metric $m$ through a closed-form decision rule $\Phi$ (e.g., mean or median); Table \ref{tbl:decision_rules} presents specific solutions across metrics.
Figure 2: Examples from the Amazon dataset with the corresponding human annotations and samples from the model. In many cases, taking the model distribution into account (i.e., using the mean of the distribution) yields a prediction closer to the annotation than simply taking the mode of the distribution.
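
The contrast between the mode and distribution-aware rules described in the captions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names `parse_numeric` and `decision_rules` are hypothetical, and the sample strings are made up rather than taken from the Amazon dataset.

```python
from collections import Counter
from statistics import mean, median


def parse_numeric(samples):
    """Keep only the sampled strings that parse as numbers."""
    values = []
    for s in samples:
        try:
            values.append(float(s.strip()))
        except ValueError:
            # Skip responses that are not numeric (an assumed policy).
            continue
    return values


def decision_rules(samples):
    """Compute three candidate predictions from sampled responses."""
    values = parse_numeric(samples)
    if not values:
        raise ValueError("no numeric samples to aggregate")
    # Mode: the most frequent sampled value.
    mode_value = Counter(values).most_common(1)[0][0]
    return {
        "mode": mode_value,
        "mean": mean(values),      # minimizes empirical squared error
        "median": median(values),  # minimizes empirical absolute error
    }


# Made-up samples for a 1-5 rating task; one response is unparseable.
samples = ["5", "5", "5", "4", "3", "3", "2", "n/a"]
preds = decision_rules(samples)
print(preds)
```

Here the mode is 5, while the mean (about 3.86) and the median (4) reflect the probability mass the samples place on lower ratings. Dropping unparseable responses is one possible design choice; the source does not state how such samples are handled.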
