Enhancing Zero-shot Chain of Thought Prompting via Uncertainty-Guided Strategy Selection

Shanu Kumar, Saish Mendke, Karody Lubna Abdul Rahman, Santosh Kurasa, Parag Agrawal, Sandipan Dandapat

TL;DR

This work improves chain-of-thought prompting without manual demonstrations or access to model parameters. It introduces ZEUS, a zero-shot method that estimates uncertainty via perturbation-based analysis (temperature, trigger phrases, and rephrasing) and uses these estimates to select informative questions and construct diverse demonstrations through clustering. Across four reasoning benchmarks and multiple LLMs, ZEUS consistently matches or surpasses strong baselines such as Zero-Shot-CoT and Auto-CoT while avoiding manual annotation. The findings emphasize robust uncertainty signals and practical strategy selection (LU) that generalize across models, enabling scalable prompt engineering from unlabeled questions.

Abstract

Chain-of-thought (CoT) prompting has significantly enhanced the capability of large language models (LLMs) by structuring their reasoning processes. However, existing methods face critical limitations: handcrafted demonstrations require extensive human expertise, while trigger phrases are prone to inaccuracies. In this paper, we propose the Zero-shot Uncertainty-based Selection (ZEUS) method, a novel approach that improves CoT prompting by utilizing uncertainty estimates to select effective demonstrations without needing access to model parameters. Unlike traditional methods, ZEUS offers high sensitivity in distinguishing between helpful and ineffective questions, ensuring more precise and reliable selection. Our extensive evaluation shows that ZEUS consistently outperforms existing CoT strategies across four challenging reasoning benchmarks, demonstrating its robustness and scalability.
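
The uncertainty-based selection described above can be illustrated with a minimal sketch. It assumes that uncertainty is measured as the normalized Shannon entropy of the final answers collected for one question under different perturbations, and that a question is kept when its score falls inside a chosen range. The entropy formula, the function names `answer_entropy` and `select_by_uncertainty`, and the bounds `low` and `high` are illustrative assumptions, not the paper's exact definitions.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Normalized Shannon entropy of a pool of final answers, in [0, 1].

    Assumption: the pool holds one final answer per perturbed prompt
    (e.g. different trigger phrases, sampling temperatures, rephrasings).
    """
    total = len(answers)
    if total == 0:
        raise ValueError("answer pool is empty")
    counts = Counter(answers)
    if len(counts) == 1:
        # All perturbations agree: no uncertainty.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Entropy is at most log(total), reached when every answer is different.
    return entropy / math.log(total)


def select_by_uncertainty(pools, low, high):
    """Return the questions whose uncertainty lies within [low, high].

    `pools` maps each question to its list of answers; `low` and `high`
    are hypothetical thresholds chosen by the user.
    """
    scores = {question: answer_entropy(answers) for question, answers in pools.items()}
    return [question for question, score in scores.items() if low <= score <= high]


# Toy answer pools for three hypothetical questions.
pools = {
    "q1": ["8", "8", "8", "8"],  # full agreement
    "q2": ["8", "8", "9", "9"],  # split between two answers
    "q3": ["1", "2", "3", "4"],  # every answer differs
}
selected = select_by_uncertainty(pools, low=0.25, high=0.75)
```

In this toy setup only `q2` falls inside the chosen range: `q1` has no disagreement and `q3` has maximal disagreement. Which range is most useful is an empirical question that the sketch does not address.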

Paper Structure

This paper contains 16 sections, 3 equations, 16 figures, 4 tables.

Figures (16)

  • Figure 1: Overview of ZEUS: Uncertainty for a question $q_j$ is calculated using a pool of answers generated using various prompts, including trigger phrases, non-zero temperature-based decoding, and rephrasing of $q_j$. Subsequently, questions with uncertainty within a certain range are selected and used for constructing demonstrations.
  • Figure 2: Mean and standard deviation of uncertainty values as error graph -specific statistics across models.
  • Figure 3: Probability density function of uncertainty estimates of our method using GPT3.5 on GSM8K.
  • Figure 4: Normalized values of accuracy for various selection strategies using multiple LLMs.
  • Figure 5: Sensitivity coefficient of the confidence score with respect to accuracy. Blue indicates ZEUS and magenta indicates Temp-Perb; solid lines denote GPT3-XL and dashed lines denote GPT3.5. The coefficient obtained with ZEUS is closest to the ideal coefficient.
  • ...and 11 more figures