Understanding the Effects of Iterative Prompting on Truthfulness

Satyapriya Krishna; Chirag Agarwal; Himabindu Lakkaraju

TL;DR

This work investigates how iterative prompting affects the truthfulness of large language models. It introduces Start Prompt and Iteration Prompt components, formalizing the process as $R_i = \mathcal{M}(P, I_P, \{R_0, \ldots, R_{i-1}\})$ with the objective of maximizing $\mathcal{L}(R_N)$, such as accuracy. Through experiments on TruthfulQA with GPT-3.5, the study finds that naive prompts degrade accuracy and calibration due to sycophantic apologies, but two improved prompt variants (Improved Prompt-1 and Improved Prompt-2) substantially mitigate these issues and outperform baseline iterative methods like Self-Consistency. The results underscore the critical role of prompt design in enhancing truthfulness and calibration, offering a path toward more trustworthy AI systems.
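The recurrence $R_i = \mathcal{M}(P, I_P, \{R_0, \ldots, R_{i-1}\})$ can be read as a simple loop. The sketch below is only an illustration under stated assumptions: the `Model` callable signature, the function names, and the toy model that flips its answer whenever it is re-prompted are all hypothetical and are not taken from the paper or from any real LLM API.

```python
from typing import Callable, List

# Hypothetical model interface (an assumption for illustration): it receives the
# start prompt P, the iteration prompt I_P, and the earlier responses, and
# returns the next response R_i.
Model = Callable[[str, str, List[str]], str]


def iterative_prompting(
    model: Model, start_prompt: str, iteration_prompt: str, n_iterations: int
) -> List[str]:
    """Return [R_0, ..., R_N], where R_i = M(P, I_P, {R_0, ..., R_{i-1}})."""
    responses: List[str] = []
    for _ in range(n_iterations + 1):
        # Pass a copy of the history so the model cannot mutate it.
        responses.append(model(start_prompt, iteration_prompt, list(responses)))
    return responses


def final_accuracy(final_responses: List[str], truths: List[str]) -> float:
    """One possible objective L(R_N): fraction of final responses that match the truth."""
    if not truths:
        return 0.0
    hits = sum(r == t for r, t in zip(final_responses, truths))
    return hits / len(truths)


def toy_flipping_model(start_prompt: str, iteration_prompt: str, history: List[str]) -> str:
    """Toy stand-in for an LLM: answers "No" first, then flips on every re-prompt."""
    if not history:
        return "No"
    return "Yes" if history[-1] == "No" else "No"


if __name__ == "__main__":
    trace = iterative_prompting(
        toy_flipping_model, "Is Canada part of the UK?", "Are you sure?", 2
    )
    print(trace)
```

With a real model, `toy_flipping_model` would be replaced by a call that sends the prompts and the response history to the LLM; the loop itself stays the same.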

Abstract

The development of Large Language Models (LLMs) has notably transformed numerous sectors, offering impressive text generation capabilities. Yet, the reliability and truthfulness of these models remain pressing concerns. To this end, we investigate iterative prompting, a strategy hypothesized to refine LLM responses, assessing its impact on LLM truthfulness, an area which has not been thoroughly explored. Our extensive experiments delve into the intricacies of iterative prompting variants, examining their influence on the accuracy and calibration of model responses. Our findings reveal that naive prompting methods significantly undermine truthfulness, leading to exacerbated calibration errors. In response to these challenges, we introduce several prompting variants designed to address the identified issues. These variants demonstrate marked improvements over existing baselines, signaling a promising direction for future research. Our work provides a nuanced understanding of iterative prompting and introduces novel approaches to enhance the truthfulness of LLMs, thereby contributing to the development of more accurate and trustworthy AI systems.

Paper Structure

This paper contains 13 sections, 2 equations, and 20 figures.

Introduction
Related Works
Iterative Prompting
Experiment Setup
Results
Effects of Naive Iterative Prompting
Improved Iteration Prompt
Comparison Against Other Iterative Prompting
Prompt Sensitivity
Conclusion
Additional Results
Variants of Naive Iterative Prompting
Variants of Improved Iterative Prompting

Figures (20)

Figure 1: Iterative Prompting Framework. It comprises two steps: (1) Start Prompt: the initial task introduction for the LLM, and (2) Iterative Prompting: re-prompting the LLM with its own response for self-assessment and improvement. Ideally, the model would correct its response after iterative prompting.
Figure 2: Prompt Design
Figure 3: Effect of iterative prompting on TruthfulQA. We observe a significant decline in accuracy, with the number of incorrect answer flips markedly exceeding that of correct flips.
Figure 4: Naive Prompting ECE on TruthfulQA. There is a sharp rise in ECE from the start response (iteration 1) to the second response (iteration 2), which leads to a significant drop in truthfulness accuracy.
Figure 5: [Naive Prompting] Sequence of question and response iterations on identifying whether Canada is part of the UK. The assistant's answers change from correct to incorrect when it is asked to confirm its response.
...and 15 more figures
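
The ECE reported in the Figure 4 caption is the expected calibration error. A minimal sketch of the common equal-width binned estimate follows; the number of bins, the half-open bin boundaries, and the function name are assumptions for illustration, and the paper's exact binning scheme is not given in this summary.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Binned ECE: sum over bins of (bin size / n) * |bin accuracy - bin mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins  # total confidence per bin
    acc_sum = [0.0] * n_bins   # number of correct answers per bin
    count = [0] * n_bins       # number of samples per bin

    for c, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        acc_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1

    # (count/n) * |acc_sum/count - conf_sum/count| simplifies to |acc_sum - conf_sum| / n.
    return sum(abs(acc_sum[k] - conf_sum[k]) / n for k in range(n_bins) if count[k])


if __name__ == "__main__":
    # Four answers at 90% stated confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

A value near zero means stated confidence tracks observed accuracy; comparing this value at iteration 1 and iteration 2 gives the kind of contrast that the Figure 4 caption describes.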
