Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models

Xiaolong Wang, Yile Wang, Yuanchi Zhang, Fuwen Luo, Peng Li, Maosong Sun, Yang Liu

TL;DR

The paper tackles subjective reasoning in LLMs, where interpretation and emotion play a central role and traditional chain-of-thought prompts often fall short. It introduces RiC, a tuning-free method that solves subjective tasks via dialogue simulation. RiC comprises three steps: keyword extraction, dialogue-based scenario construction, and dialogue-enhanced reasoning, with an optional variant that unifies them in a single prompt. Across twelve subjective datasets and multiple models (GPT-4, ChatGPT, OpenChat), RiC delivers significant improvements over strong baselines in both zero-shot and few-shot settings, highlighting the value of dialogue-derived contextual knowledge. The work shows that simulating human-like dialogues can surface useful information behind a question, offering a scalable and practical way to improve subjective reasoning in LLMs and informing future benchmarks and domain-specific adaptations.

Abstract

Large Language Models (LLMs) have achieved remarkable performance in objective tasks such as open-domain question answering and mathematical reasoning, which can often be solved by recalling learned factual knowledge or by chain-of-thought style reasoning. However, we find that the performance of LLMs in subjective tasks, such as metaphor recognition and dark humor detection, is still unsatisfactory. Compared to objective tasks, subjective tasks focus more on interpretation or emotional response than on a universally accepted reasoning pathway. Based on the characteristics of these tasks and the strong dialogue-generation capabilities of LLMs, we propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation. The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales, thereby offering potentially useful knowledge behind the dialogues for giving the final answers. We evaluate both API-based and open-source LLMs, including GPT-4, ChatGPT, and OpenChat, across twelve tasks. Experimental results show that RiC can yield significant improvement compared with various baselines.
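
The three stages named in the TL;DR (keyword extraction, dialogue-based scenario construction, dialogue-enhanced reasoning) can be pictured as a short chain of prompts. The following is a minimal sketch only, not the authors' implementation: the prompt wording, the function names, and the `stub_llm` stand-in are all assumptions made for illustration, and any callable that maps a prompt string to a completion string could take its place.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def extract_keywords(llm: LLM, question: str) -> str:
    """Stage 1: ask the model for the key terms in the question."""
    prompt = f"List the keywords in this question.\nQuestion: {question}\nKeywords:"
    return llm(prompt).strip()


def simulate_dialogue(llm: LLM, question: str, keywords: str) -> str:
    """Stage 2: ask the model to write a dialogue built around the keywords."""
    prompt = (
        "Write a short dialogue between two people that involves these keywords.\n"
        f"Keywords: {keywords}\nQuestion: {question}\nDialogue:"
    )
    return llm(prompt).strip()


def answer_with_dialogue(llm: LLM, question: str, dialogue: str) -> str:
    """Stage 3: answer the question with the simulated dialogue as context."""
    prompt = (
        f"Dialogue:\n{dialogue}\n\n"
        f"Using the dialogue as context, answer the question.\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt).strip()


def ric(llm: LLM, question: str) -> str:
    """Chain the three stages; each stage feeds the next one's prompt."""
    keywords = extract_keywords(llm, question)
    dialogue = simulate_dialogue(llm, question, keywords)
    return answer_with_dialogue(llm, question, dialogue)


def stub_llm(prompt: str) -> str:
    """Stand-in model for demonstration; returns canned text per stage."""
    if prompt.startswith("List the keywords"):
        return "storm, anger"
    if prompt.startswith("Write a short dialogue"):
        return "A: He stormed out.\nB: So he was angry, not caught in the rain."
    return "metaphorical"


if __name__ == "__main__":
    print(ric(stub_llm, "Is 'he stormed out' metaphorical?"))
```

Because the pipeline is tuning-free, swapping `stub_llm` for a real model call changes nothing else in the chain. The unified prompting variant mentioned in the TL;DR would collapse the three calls into one prompt, which this sketch does not attempt.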

Paper Structure (16 sections, 4 equations, 5 figures, 17 tables)

Figures (5)

  • Figure 1: Illustration of our method. (a) An example of the metaphor recognition task. (b) Incorrect responses from an LLM using zero-shot-CoT prompting (Kojima et al., 2023). (c) Our method can simulate helpful dialogues (shown in the dashed box), thereby offering useful information in the generated conversation and aiding reasoning on this subjective task.
  • Figure 2: Illustration of simulated dialogues for the questions in different types of subjective tasks from the table of subjective tasks.
  • Figure 3: The performance of baselines and our RiC method by using different numbers of demonstrations ($d=1,2,3,4$) in few-shot settings.
  • Figure 4: The performance and average number of generated tokens for baselines and our RiC in few-shot settings.
  • Figure 5: Types of knowledge found in RiC's simulated dialogues across $120$ sampled examples, $10$ per task.