Does "Reasoning" with Large Language Models Improve Recognizing, Generating, and Reframing Unhelpful Thoughts?

Yilin Qi, Dong Won Lee, Cynthia Breazeal, Hae Won Park

TL;DR

The study shows that simply augmenting LLMs with reasoning strategies (e.g., DoT, Self-Consistency, CoT, ToT) yields substantial performance gains on cognitive reframing tasks over pretrained reasoning models. Across recognition, generation, and reframing tasks drawn from PatternReframe, augmented reasoning consistently improves outcomes, though alignment to specific reframing strategies remains challenging for comprehensive CBT deployment. A novel Task 4 assessing strategic reframing reveals a need for better alignment and controllable generation, highlighting practical considerations for clinical adoption. Overall, reasoning-augmented prompting emerges as an efficient, scalable route to enhance LLM-assisted cognitive-behavioral interventions, with implications for therapy-support tools and mental health chat systems.

Abstract

Cognitive Reframing, a core element of Cognitive Behavioral Therapy (CBT), helps individuals reinterpret negative experiences by finding positive meaning. Recent advances in Large Language Models (LLMs) have demonstrated improved performance through reasoning-based strategies. This suggests a promising direction: leveraging the reasoning capabilities of LLMs to improve CBT and mental reframing by simulating the process of critical thinking, potentially enabling more effective recognition, generation, and reframing of cognitive distortions. In this work, we investigate the role of various reasoning methods, including pretrained reasoning LLMs and augmented reasoning strategies such as Chain-of-Thought (CoT) and self-consistency, in enhancing LLMs' ability to perform cognitive reframing tasks. We find that augmented reasoning methods, even when applied to "outdated" LLMs like GPT-3.5, consistently outperform state-of-the-art pretrained reasoning models on recognizing, generating, and reframing unhelpful thoughts.
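
To make the self-consistency strategy named in the abstract concrete, here is a minimal Python sketch of its usual form applied to the recognition task: sample several independent answers for the same thought and keep the majority label. This is an illustration only, not the authors' implementation. The `sample_label` callable, the stub sampler, and the label strings are hypothetical stand-ins for a real LLM call and the dataset's actual pattern names.

```python
from collections import Counter
from typing import Callable, Iterable


def self_consistency_vote(
    sample_label: Callable[[str], str],
    thought: str,
    n_samples: int = 5,
) -> str:
    """Query the sampler n_samples times and return the most frequent label.

    sample_label is a hypothetical callable wrapping one stochastic LLM call
    (for example, a reasoning prompt decoded with temperature > 0) that
    returns a single predicted unhelpful-thought pattern.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes = [sample_label(thought) for _ in range(n_samples)]
    # Counter.most_common breaks ties in favor of the label seen first.
    return Counter(votes).most_common(1)[0][0]


def make_stub_sampler(outputs: Iterable[str]) -> Callable[[str], str]:
    """Build a deterministic stand-in for an LLM that replays fixed outputs."""
    replay = iter(outputs)
    return lambda thought: next(replay)


if __name__ == "__main__":
    # Illustrative labels only; a real run would use the dataset's label set.
    sampler = make_stub_sampler(
        ["catastrophizing", "mind reading", "catastrophizing"]
    )
    print(self_consistency_vote(sampler, "I will fail everything.", n_samples=3))
```

Voting on the final label rather than on the reasoning text is what lets differently worded reasoning paths agree. For free-text outputs such as reframes, a different aggregation step would be needed, and this sketch does not cover it.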

Paper Structure

This paper contains 14 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Performance of representative models in each class of reasoning. Non-reasoning method: GPT-4o; pretrained reasoning method: o1; reasoning-augmented methods: GPT-3.5 + DoT and GPT-3.5 + Self-Consistency.
  • Figure 2: Output tokens compared to performance for each method across Tasks 1 and 3. Markers distinguish reasoning-augmented, pretrained reasoning, and non-reasoning models. As indicated by the best-performing model, drawn with a larger circle, reasoning-augmented models can outperform pretrained reasoning models. The fitted line is a linear regression between average output tokens and performance. There is a positive linear relationship between the number of output tokens and performance for recognition, and a negative relationship for reframe generation.
  • Figure 3: Output tokens compared to performance for each method across Tasks 1, 2, 3, and 4. Markers distinguish reasoning-augmented, pretrained reasoning, and non-reasoning models. As indicated by the best-performing model, drawn with a larger circle, reasoning-augmented models can outperform pretrained reasoning models.