Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models

Feihu Jin, Yifan Liu, Ying Tan

TL;DR

This paper tackles the limitation of uniform zero-shot Chain-of-Thought prompting by introducing zero-shot EoT prompting, which uses an LLM as an evolutionary optimizer to generate diverse, instance-specific CoT prompts. A selected prompt then guides problem rewriting and intermediate reasoning, with an answer extraction step to ensure measurable outputs. Across ten arithmetic, commonsense, and symbolic datasets, EoT outperforms standard zero-shot CoT, PS/PS+, and RE2 prompts and approaches the performance of few-shot CoT, with robust gains observed on both GPT-3.5-turbo and GPT-4. Ablation and self-consistency analyses deepen understanding of the method’s components, showing the importance of rewriting and evolutionary operations and indicating practical benefits for complex reasoning tasks.

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks and exhibited impressive reasoning abilities by applying zero-shot Chain-of-Thought (CoT) prompting. However, due to the evolving nature of sentence prefixes during the pre-training phase, existing zero-shot CoT prompting methods that employ identical CoT prompting across all task instances may not be optimal. In this paper, we introduce a novel zero-shot prompting method that leverages evolutionary algorithms to generate diverse promptings for LLMs dynamically. Our approach involves initializing two CoT promptings, performing evolutionary operations based on LLMs to create a varied set, and utilizing the LLMs to select a suitable CoT prompting for a given problem. Additionally, a rewriting operation, guided by the selected CoT prompting, enhances the understanding of the LLMs about the problem. Extensive experiments conducted across ten reasoning datasets demonstrate the superior performance of our proposed method compared to current zero-shot CoT prompting methods on GPT-3.5-turbo and GPT-4. Moreover, in-depth analytical experiments underscore the adaptability and effectiveness of our method in various reasoning tasks.
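
The pipeline described in the abstract (two initial CoT prompts, LLM-driven evolutionary operations, prompt selection, problem rewriting, reasoning, answer extraction) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable, every prompt template, the second seed prompt, and the choice of crossover followed by mutation as the evolutionary operations are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List

# Hypothetical interface: takes prompt text, returns the model's completion.
LLM = Callable[[str], str]

EXTRACT_SUFFIX = "Therefore, the answer is"


def evolve_prompts(llm: LLM, seeds: List[str], population_size: int) -> List[str]:
    """Grow the seed prompts into a population using LLM-driven operators."""
    population = list(seeds)
    while len(population) < population_size:
        parent_a, parent_b = population[-2], population[-1]
        # Assumed operator 1 (crossover): merge two parent prompts.
        child = llm(
            "Combine these two prompts into one new prompt:\n"
            f"1. {parent_a}\n2. {parent_b}"
        )
        # Assumed operator 2 (mutation): paraphrase the merged prompt.
        child = llm(f"Rephrase this prompt while keeping its intent:\n{child}")
        population.append(child.strip())
    return population


def select_prompt(llm: LLM, question: str, population: List[str]) -> str:
    """Ask the LLM which candidate prompt suits this problem instance."""
    options = "\n".join(f"{i + 1}. {p}" for i, p in enumerate(population))
    reply = llm(
        "Pick the reasoning prompt best suited to the problem. "
        "Reply with its number only.\n"
        f"Problem: {question}\n{options}"
    )
    match = re.search(r"\d+", reply)
    index = int(match.group()) - 1 if match else 0
    # Fall back to the first prompt if the reply is out of range.
    return population[index] if 0 <= index < len(population) else population[0]


def eot_answer(llm: LLM, question: str, population_size: int = 4) -> Dict[str, str]:
    """Run the sketched pipeline for a single problem instance."""
    seeds = [
        "Let's think step by step.",                  # standard zero-shot CoT prompt
        "Let's work through this problem carefully.",  # illustrative second seed (assumption)
    ]
    population = evolve_prompts(llm, seeds, population_size)
    chosen = select_prompt(llm, question, population)
    # Rewriting step, guided by the selected prompt.
    rewritten = llm(
        f"Rewrite the problem below so it is clearer, guided by the instruction "
        f"'{chosen}'.\nProblem: {question}"
    )
    # Intermediate reasoning on the rewritten problem.
    reasoning = llm(f"{rewritten}\n{chosen}")
    # Answer extraction so the output can be scored.
    answer = llm(f"{reasoning}\n{EXTRACT_SUFFIX}").strip()
    return {"prompt": chosen, "rewritten": rewritten,
            "reasoning": reasoning, "answer": answer}
```

Passing the model as a plain callable keeps the sketch independent of any particular API client; in practice `llm` would wrap a call to a model such as GPT-3.5-turbo or GPT-4, and `population_size` corresponds to the population size $N$ varied in Figure 2.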

Paper Structure

This paper contains 21 sections, 4 equations, 2 figures, 16 tables.

Figures (2)

  • Figure 1: Example inputs and outputs of GPT-3.5-turbo with (a) zero-shot CoT prompting and (b) zero-shot EoT prompting. Zero-shot CoT prompting attaches the sentence "Let's think step by step" to each instance to encourage LLMs to generate multi-step reasoning. Our proposed method, EoT prompting, uses the LLM as an evolutionary optimizer and generates a suitable CoT prompt for each instance.
  • Figure 2: Results for different population sizes $N$, measured on four math reasoning datasets with GPT-3.5-turbo.