An Examination on the Effectiveness of Divide-and-Conquer Prompting in Large Language Models

Yizhou Zhang, Lun Du, Defu Cao, Qiang Fu, Yan Liu

TL;DR

The paper analyzes divide-and-conquer (DaC) prompting for large language models, deriving a theoretical framework that situates DaC as more expressive than IO prompting ($S(IO) \subseteq TC^0 \subseteq NC^1 \subseteq S(DaC)$) while being subsumed by CoT in general ($S(DaC) \subseteq S(CoT)$). It further argues that DaC can reduce decoding context length for tasks with many parallel sub-tasks, potentially lowering intermediate errors. Empirically, DaC outperforms baselines on long integer multiplication and article-level fact verification, while showing limited gains on simple addition; notably, it improves recall in hallucination and misinformation detection relative to CoT-based methods. The work offers concrete conditions under which DaC is advantageous and provides guidance for prompt engineering in long-context or deceptive-content tasks, along with a framework for extending DaC to new domains.

Abstract

Foundation models, such as Large Language Models (LLMs), have attracted a significant amount of interest due to their large number of applications. However, when handling tasks involving repetitive sub-tasks and/or deceptive content, such as arithmetic calculation and article-level fake news detection, simple instructional prompts yield inaccurate responses. Existing work shows that more complicated prompting strategies, such as Chain-of-Thoughts and Least-to-Most, can unlock LLMs' powerful capacity in diverse areas. Recent research reveals that a simple divide-and-conquer prompting strategy, i.e., simply dividing the input sequence into multiple sub-inputs, can also substantially improve LLMs' performance on some specific tasks such as misinformation detection. In this paper, we examine the utility of the divide-and-conquer prompting strategy and answer the question of which kinds of tasks benefit from it. Specifically, we provide a theoretical analysis of the divide-and-conquer prompting strategy that helps us identify the specific tasks where DaC prompting can bring a performance boost with a theoretical guarantee. We then present two cases (large integer arithmetic and fact verification) where experimental results align with our theoretical analysis.
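
The divide-and-conquer idea described in the abstract (split the input into sub-inputs, solve each independently, then assemble the results) can be sketched for long integer multiplication as follows. This is a minimal illustration, not the authors' prompts or algorithm: the function names `divide`, `dac_multiply`, and `exact_solver` are hypothetical, and `exact_solver` is a stand-in for whatever LLM call would answer a single sub-task.

```python
from typing import Callable, List


def divide(a: str, chunk: int) -> List[str]:
    """Split decimal string `a` into chunks of at most `chunk` digits,
    least significant chunk first."""
    parts = []
    end = len(a)
    while end > 0:
        start = max(0, end - chunk)
        parts.append(a[start:end])
        end = start
    return parts


def dac_multiply(a: str, b: str,
                 solve_subtask: Callable[[str, str], int],
                 chunk: int = 2) -> int:
    """Multiply a * b in three separated stages."""
    # Stage 1 (divide): cut one operand into short digit chunks.
    parts = divide(a, chunk)
    # Stage 2 (conquer): each sub-task is independent of the others,
    # so each call only sees a short context.
    partials = [solve_subtask(p, b) for p in parts]
    # Stage 3 (merge): shift each partial product to its place value and sum.
    return sum(p * 10 ** (chunk * i) for i, p in enumerate(partials))


def exact_solver(x: str, y: str) -> int:
    """Hypothetical stand-in for an LLM call answering one sub-task."""
    return int(x) * int(y)


result = dac_multiply("123456789", "987654321", exact_solver, chunk=3)
print(result == 123456789 * 987654321)
```

Because the merge stage is ordinary code here, only the conquer stage would depend on the model in this sketch; replacing `exact_solver` with a real model call is where sub-task errors could enter.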

Paper Structure (23 sections, 6 theorems, 11 equations, 7 figures, 3 tables, 7 algorithms)

Key Result

Theorem 4.1

We denote the set of problems that a fixed-precision transformer with fixed-length IO prompting can tackle as $S(IO)$. Similarly, we denote the set of problems that a fixed-precision transformer with DaC prompting can tackle as $S(DaC)$. Then we have the following results:
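
Per the TL;DR, the containments associated with this theorem can be written as a single chain. This is a restatement from the summary rather than the theorem's verbatim text; the middle inclusion $TC^0 \subseteq NC^1$ is a standard fact from circuit complexity.

```latex
\[
  S(IO) \;\subseteq\; TC^0 \;\subseteq\; NC^1 \;\subseteq\; S(DaC)
\]
```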

Figures (7)

  • Figure 1: An illustrative example of hallucination detection with entangled problem solving (i.e., directly feeding all inputs into the LLM) and divide-and-conquer problem solving (i.e., dividing the problem inputs into parallel sub-tasks and tackling them in parallel). The sentence marked with a red background in the material is the evidence that contradicts the first claim in the summary (marked in red font).
  • Figure 2: A comparison between DaC and existing prompting methods. The ellipses represent sub-tasks, the right-angled rectangles represent sub-task solutions, and the rounded rectangles represent intermediate steps that entangle sub-tasks and sub-solutions. The different shades in Tree of Thoughts (subfigure D) indicate the rates of different search directions. In CoT (Chain-of-Thoughts), CoT-SC, and ToT, the Large Language Model must generate and resolve sub-tasks simultaneously. Least-to-Most (as well as Decomposed Prompting) disentangles sub-task generation from sub-task resolution. However, its sub-task resolution and resolution assembly processes are intertwined, as it sequentially attaches new sub-tasks onto the previous resolution. In contrast, DaC fully disentangles the sub-task generation, sub-task resolution, and resolution assembly processes.
  • Figure 3: Edit distance of DaC and baseline prompting strategies on GPT-3.5 and GPT-4 for Multiplication.
  • Figure 4: Edit distance of DaC and baseline prompting strategies on GPT-3.5 and GPT-4 for Addition.
  • Figure 5: Comparison of Least-to-Most (LtM) Prompting and Decomposed Prompting (DeP).
  • ...and 2 more figures
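
Figures 3 and 4 report edit distance as the error metric for the arithmetic tasks. The exact definition used in the paper is not given in this excerpt; the sketch below assumes the standard Levenshtein distance between the predicted and reference digit strings, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# A prediction with one wrong digit is at distance 1 from the reference.
print(edit_distance("121932631", "121932641"))  # 1
```

Under this assumption, a lower value means the model's output string is closer to the correct answer, with 0 indicating an exact match.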

Theorems & Definitions (11)

  • Theorem 4.1
  • Definition 1
  • Theorem 4.2
  • Proposition 4.3
  • proof
  • Definition 2
  • Proposition 4.4
  • Lemma A.1
  • proof
  • Theorem A.2
  • ...and 1 more