Are LLMs Aware that Some Questions are not Open-ended?

Dongjie Yang; Hai Zhao

TL;DR

This paper proposes Question Awareness Temperature Sampling (QuATS), a method that enhances the question awareness of LLMs by adaptively adjusting their output distributions based on question features. QuATS eliminates the need for manual temperature tuning in text generation and consistently improves model performance on various benchmarks.

Abstract

Large Language Models (LLMs) have shown an impressive capability to answer questions in a wide range of scenarios. However, when LLMs face different types of questions, it is worth exploring whether they are aware that some questions have a limited set of answers and call for more deterministic responses, while others do not. We refer to this as the question awareness of LLMs. A lack of question awareness leads to two phenomena: LLMs are (1) too casual when answering non-open-ended questions or (2) too boring when answering open-ended questions. In this paper, we first evaluate the question awareness of LLMs. The experimental results show that LLMs lack question awareness in certain domains, e.g., factual knowledge, which results in hallucinations during generation. To mitigate this, we propose a method called Question Awareness Temperature Sampling (QuATS). This method enhances the question awareness of LLMs by adaptively adjusting the output distributions based on question features. The automatic adjustment in QuATS eliminates the need for manual temperature tuning in text generation and consistently improves model performance on various benchmarks.
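To make the idea of adjusting the output distribution per question concrete, here is a minimal sketch of temperature sampling in which the temperature is derived from a per-question determinacy score. This is an illustration only, not the authors' implementation: the function names, the linear mapping from determinacy to temperature, and the bounds `t_min` and `t_max` are all assumptions introduced here, and the determinacy score is simply passed in rather than predicted by a trained module.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_from_determinacy(determinacy, t_min=0.2, t_max=1.2):
    """Hypothetical linear mapping (an assumption, not taken from the paper).

    A determinacy of 1.0 (non-open-ended question) maps to t_min,
    a determinacy of 0.0 (open-ended question) maps to t_max.
    """
    d = min(max(determinacy, 0.0), 1.0)  # clamp to [0, 1]
    return t_max - d * (t_max - t_min)


def sample_token(logits, determinacy, rng=random):
    """Sample one token index with a question-dependent temperature."""
    temperature = temperature_from_determinacy(determinacy)
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

With this mapping, a question judged highly determinate is sampled from a sharper distribution, while an open-ended question keeps a flatter one, so no single fixed temperature has to be chosen by hand.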

Paper Structure (26 sections, 10 equations, 5 figures, 2 tables, 1 algorithm)

This paper contains 26 sections, 10 equations, 5 figures, 2 tables, 1 algorithm.

Introduction
Question Awareness Evaluation
Formulation of the Next Token Prediction
Metric
Evaluation Process
Results and Analysis
LLMs lack a strong sense of question awareness.
Question awareness greatly affects model performance.
Larger models have more confidence in text generation.
Question Awareness Temperature Sampling
Training A DetBlock to Predict Determinacy
Training Dataset
DetBlock Structure
Training Process
Inference with QuATS
...and 11 more sections

Figures (5)

Figure 1: LLMs should choose to be deterministic to answer the question on the left but can have more choices to answer the one on the right.
Figure 2: The result of the question awareness evaluation. The dotted lines are linearly fitted trend lines of the kurtoses.
Figure 3: Comparison between QuATS and baselines with different fixed temperatures using LLaMA 2-Chat 13B (Touvron et al., 2023) on downstream tasks. The temperatures adjust the kurtoses, which influence the performance on open-ended and non-open-ended questions differently. In contrast, the adaptive temperature strategy of QuATS consistently outperforms temperature sampling with fixed temperatures.
Figure 4: Overview of QuATS.
Figure 5: The result of the question awareness evaluation of LLaMA 2-Chat models using QuATS.
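
The captions for Figures 2 and 3 refer to the kurtosis of the output distribution and note that temperature changes it. The sketch below shows one plausible reading of that relationship. It is an assumption that kurtosis is taken as the Pearson (non-excess) kurtosis of the probability values across the vocabulary; the paper's exact metric may differ, and the function names and toy logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kurtosis(values):
    """Pearson kurtosis m4 / m2**2 of a list of numbers.

    Returns NaN when the values are all identical (zero variance),
    because the ratio is undefined in that case.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return float("nan")
    return m4 / (m2 ** 2)


# Toy logits over an 8-token vocabulary (hypothetical values).
logits = [4.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

sharp = kurtosis(softmax(logits, temperature=0.1))  # near one-hot
flat = kurtosis(softmax(logits, temperature=10.0))  # near uniform
print(sharp > flat)
```

For these toy logits, the low-temperature distribution concentrates almost all mass on one token and has the higher kurtosis, which matches the intuition that a more deterministic response corresponds to a more peaked distribution.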
