Hint-before-Solving Prompting: Guiding LLMs to Effectively Utilize Encoded Knowledge

Jinlan Fu; Shenzhen Huangfu; Hang Yan; See-Kiong Ng; Xipeng Qiu

TL;DR

Hint-before-Solving Prompting (HSP) guides LLMs to generate useful hints before solving a problem, helping them draw on encoded knowledge to produce more accurate and coherent intermediate reasoning. Integrated with existing prompting paradigms (CoT, Least-to-Most, Plan-and-Solve, and standard prompting), and in a two-stage variant called HSP2, the approach improves accuracy across six reasoning benchmarks and four open-source LLMs, including an average relative gain of roughly 9.7% for CoT on Llama-2-Chat-70B when high-quality hints are supplied. The authors also introduce HSPMATH, a large-scale hint-enhanced dataset, and show that supervised fine-tuning of Llemma-7B on this HSP-format data reaches 64.3% accuracy, surpassing GPT-3.5 and WizardMath-13B. Overall, the results indicate that high-quality hints, including self-generated ones, can improve reasoning, particularly when combined with self-consistency and higher-capacity open models, which has practical implications for helping models use their encoded knowledge in complex tasks.

Abstract

Large Language Models (LLMs) have recently showcased remarkable generalizability in various domains. Despite their extensive knowledge, LLMs still face challenges in efficiently utilizing encoded knowledge to develop accurate and logical reasoning processes. To mitigate this problem, we introduce Hint-before-Solving Prompting (HSP), which guides the model to generate hints (e.g., specific knowledge or key ideas) for solving the problem and then generate solutions containing intermediate reasoning steps. Since HSP is orthogonal to existing prompting methods (e.g., Chain-of-Thought (CoT)), we apply HSP to CoT, Least-to-Most, Plan-and-Solve, and Standard prompting. The results of extensive experiments on 6 reasoning benchmarks and 4 open-source LLMs demonstrate that HSP can effectively improve the accuracy of reasoning tasks: (1) By applying high-quality hint-enhanced HSP to CoT prompting, Llama-2-Chat-70B shows an average relative improvement of 9.7%. (2) Beyond exploring training-free LLM capabilities, we built the HSPMATH dataset based on HSP and fine-tuned Llemma-7B, reaching 64.3% accuracy and surpassing GPT-3.5 and WizardMath-13B. We make our code and dataset publicly available at https://github.com/jinlanfu/HSP.
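The hint-then-solve flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the prompt wording, the `Question:` / `Hint:` / `Solution:` labels, the function names, and the `generate` callable (any function that maps a prompt string to a completion string) are hypothetical and are not taken from the authors' released code. The two-stage function shows one plausible reading of the HSP2 variant, in which the hint is generated first and then fed back together with the question.

```python
from typing import Callable, Tuple

# Hypothetical interface: any function mapping a prompt to a completion.
Generate = Callable[[str], str]


def hsp_one_stage_prompt(question: str, solve_instruction: str) -> str:
    """Build a single prompt asking for a hint and then a solution.

    The wording is illustrative only, not the paper's actual prompt.
    """
    return (
        f"Question: {question}\n"
        "First give a short hint (specific knowledge or a key idea), "
        "then solve the problem.\n"
        f"{solve_instruction}\n"
        "Hint:"
    )


def hsp_two_stage(
    question: str, solve_instruction: str, generate: Generate
) -> Tuple[str, str]:
    """Assumed two-stage flow: generate a hint, then solve with the hint."""
    # Stage 1: ask only for the hint.
    hint_prompt = (
        f"Question: {question}\n"
        "Give a short hint (specific knowledge or a key idea) "
        "for solving this problem.\n"
        "Hint:"
    )
    hint = generate(hint_prompt).strip()

    # Stage 2: feed the hint back and ask for the step-by-step solution.
    solve_prompt = (
        f"Question: {question}\n"
        f"Hint: {hint}\n"
        f"{solve_instruction}\n"
        "Solution:"
    )
    solution = generate(solve_prompt).strip()
    return hint, solution
```

Because the hint step is independent of the solving instruction, `solve_instruction` can carry a CoT trigger such as "Let's think step by step." or an instruction in the style of another prompting method, which mirrors the abstract's point that HSP is orthogonal to the underlying prompting method.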

Paper Structure (35 sections, 6 figures, 12 tables)

This paper contains 35 sections, 6 figures, 12 tables.

Introduction
Hint-before-Solving Prompting
Experiment
Large Language Model
Datasets
Commonsense Reasoning
Baselines
Experimental Settings
Demonstration examples
Hyperparameters of Greedy Decoding
Experiments and Results
Q1: Can HSP Work?
Exp-I: When HSP Meets Existing Prompting Methods
Exp-II: Effectiveness of HSP for CoT Prompting
Exp-III: The Impact of Hint Quality
...and 20 more sections

Figures (6)

Figure 1: The output comparison of Llama-2-Chat-70B solving a math problem (calculus) with and without a hint. Red text indicates erroneous information; green text indicates correct reasoning. Findings: (1) Having a hint can help the LLM understand the problem. (2) The LLM possesses knowledge of calculus, and with a hint, it can accurately apply this knowledge.
Figure 2: Results for Llama-2-Chat-70B (under CoT prompting) with or without introducing high-quality hints across six reasoning datasets. Findings: introducing hints leads to significant improvements, with an average relative increase of 9.7%.
Figure 3: Examples of input and output before (four examples at the top) and after (four examples at the bottom) applying HSP to standard, Least-to-Most, Plan-and-Solve, and CoT promptings. The red text in the textbox indicates hints. We find that hints from LLMs, including problem-solving ideas close to the correct answer (e.g., geographical distributions of both species), guide LLMs to use accurate knowledge for correct and logical reasoning.
Figure 4: The relative performance improvement of self-consistency between CoT+HSP and CoT. The numbers of sample paths are 4, 16, 32, and 128, and the model temperature is 0.4.
Figure 5: The ratio of solution lengths between CoT and HSP+CoT (HSP applied to CoT prompting). The red line (y=1) marks where the solution length of CoT equals that of HSP+CoT.
...and 1 more figure
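Two quantities named in the captions above, the relative increase (Figure 2) and self-consistency over sampled paths (Figure 4), can be made concrete with a short sketch. It assumes the usual definitions: relative increase as the gain divided by the baseline score, and self-consistency as a majority vote over the final answers of several sampled reasoning paths. The function names and the numbers in the usage note are illustrative and are not results from the paper.

```python
from collections import Counter
from typing import Sequence


def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement in percent: (improved - baseline) / baseline * 100."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


def majority_vote(answers: Sequence[str]) -> str:
    """Self-consistency: return the most frequent final answer.

    Ties are resolved in favor of the answer that appeared first.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]
```

For example, with made-up accuracies of 50.0 for a baseline and 55.0 for the hinted variant, `relative_gain(50.0, 55.0)` gives a 10% relative gain, and `majority_vote(["6", "5", "6", "7"])` selects `"6"` from four sampled paths.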
