Large Language Models are Capable of Offering Cognitive Reappraisal, if Guided
Hongli Zhan, Allen Zheng, Yoon Kyung Lee, Jina Suh, Junyi Jessy Li, Desmond C. Ong
TL;DR
This work demonstrates that Large Language Models can be guided to produce targeted cognitive reappraisals for emotional support. It introduces RESORT, a psychologist-informed framework of reappraisal constitutions spanning six appraisal dimensions. Two prompting pipelines, Individual Guided Reappraisal and Iterative Guided Refinement, use these constitutions to generate dimension-specific reappraisals, with an optional explicit appraisal identification step. In a first-of-its-kind expert evaluation, clinical psychologists assess LLM-generated reappraisals of Reddit posts and find that RESORT-guided outputs, including those from open-source models, outperform human references in alignment and empathy while maintaining low harmfulness and reasonable factuality. The study also shows that GPT-4 can serve as an automatic evaluator with moderate agreement with human judges, which supports rapid prototyping and future development of psychologically grounded AI assistants for emotional well-being.
Abstract
Large language models (LLMs) have offered new opportunities for emotional support, and recent work has shown that they can produce empathic responses to people in distress. However, long-term mental well-being requires emotional self-regulation, where a one-time empathic response falls short. This work takes a first step by engaging with cognitive reappraisals, a strategy from psychology practitioners that uses language to change, in a targeted way, the negative appraisals that an individual makes of a situation; such appraisals are known to sit at the root of human emotional experience. We hypothesize that psychologically grounded principles could enable such advanced psychology capabilities in LLMs, and we design RESORT, which consists of a series of reappraisal constitutions across multiple dimensions that can be used as LLM instructions. We conduct a first-of-its-kind expert evaluation (by clinical psychologists with M.S. or Ph.D. degrees) of an LLM's zero-shot ability to generate cognitive reappraisal responses to medium-length social media messages asking for support. This fine-grained evaluation shows that even LLMs at the 7B scale, when guided by RESORT, are capable of generating empathic responses that can help users reappraise their situations.
