Are LLMs Aware that Some Questions are not Open-ended?

Dongjie Yang, Hai Zhao

TL;DR

This paper proposes Question Awareness Temperature Sampling (QuATS), a method that enhances the question awareness of LLMs by adaptively adjusting the output distribution based on question features. QuATS eliminates the need for manual temperature tuning in text generation and consistently improves model performance across various benchmarks.

Abstract

Large Language Models (LLMs) have shown an impressive capability to answer questions in a wide range of scenarios. However, when LLMs face different types of questions, it is worth exploring whether they are aware that some questions have a limited set of answers and call for more deterministic responses, while others do not. We refer to this as the question awareness of LLMs. A lack of question awareness leads to two phenomena: LLMs are (1) too casual when answering non-open-ended questions or (2) too boring when answering open-ended questions. In this paper, we first evaluate question awareness in LLMs. The experimental results show that LLMs lack question awareness in certain domains, e.g. factual knowledge, which results in hallucinations during generation. To mitigate this, we propose a method called Question Awareness Temperature Sampling (QuATS). This method enhances the question awareness of LLMs by adaptively adjusting the output distribution based on question features. The automatic adjustment in QuATS eliminates the need for manual temperature tuning in text generation and consistently improves model performance across various benchmarks.
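
To make the idea of adjusting the output distribution concrete, the sketch below shows plain temperature scaling of a softmax together with a hypothetical mapping from a per-question score to a temperature. It is not the authors' implementation: the function `temperature_for_question`, the `determinism_score` input, and the bounds `t_min` and `t_max` are assumptions introduced here for illustration only.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def temperature_for_question(determinism_score, t_min=0.2, t_max=1.2):
    """Hypothetical mapping (an assumption, not taken from the paper).

    determinism_score is assumed to lie in [0, 1], where 1 means the
    question has a limited set of answers (non-open-ended) and 0 means
    it is fully open-ended. Higher scores give lower temperatures.
    """
    score = min(max(determinism_score, 0.0), 1.0)
    return t_max - score * (t_max - t_min)


# Toy next-token logits for a five-token vocabulary.
logits = [3.0, 2.0, 1.0, 0.5, 0.0]

# A factual question gets a low temperature and a sharper distribution.
factual = softmax_with_temperature(logits, temperature_for_question(0.9))

# A creative prompt gets a higher temperature and a flatter distribution.
creative = softmax_with_temperature(logits, temperature_for_question(0.1))
```

Under these assumptions, the top token receives more probability mass for the factual question than for the creative prompt, which is the behaviour a fixed temperature cannot provide for both question types at once.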

Paper Structure

This paper contains 26 sections, 10 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: LLMs should choose to be deterministic to answer the question on the left but can have more choices to answer the one on the right.
  • Figure 2: The result of the question awareness evaluation. The dotted lines are linearly fitted trend lines of the kurtoses.
  • Figure 3: Comparison between QuATS and baselines with different fixed temperatures using LLaMA 2-Chat 13B (Touvron et al., 2023) on downstream tasks. The temperatures adjust the kurtoses, which influence performance on open-ended and non-open-ended questions differently. In contrast, the adaptive temperature strategy of QuATS consistently outperforms temperature sampling with fixed temperatures.
  • Figure 4: Overview of QuATS.
  • Figure 5: The result of the question awareness evaluation of LLaMA 2-Chat models using QuATS.
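
The figure captions above use kurtosis as a measure of how sharp the output distribution is. The following is a minimal sketch of that relationship, assuming kurtosis is computed as the Pearson kurtosis (fourth central moment divided by the squared variance) over the probability values of a next-token distribution. The exact definition used in the paper is not given in this section, so this choice and the toy logits are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kurtosis(values):
    """Pearson kurtosis: fourth central moment over squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant values")
    return m4 / (m2 ** 2)


# Toy logits for a ten-token vocabulary (illustrative values only).
logits = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -4.0]

# A low temperature concentrates mass on one token (sharp distribution).
sharp_kurtosis = kurtosis(softmax(logits, temperature=0.2))

# A high temperature spreads mass across tokens (flat distribution).
flat_kurtosis = kurtosis(softmax(logits, temperature=2.0))
```

For these toy logits the low-temperature distribution has the higher kurtosis, which matches the captions' statement that temperature adjusts the kurtosis of the output distribution.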