Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models

Younghun Lee, Dan Goldwasser, Laura Schwab Reese

TL;DR

This work investigates how to better understand counseling conversations by integrating human-annotated domain knowledge and LLM-generated features into predictive models of post-conversation positivity. It contrasts baseline Transformer models with knowledge-infused approaches that leverage utterance-level counseling strategies and session-level, LLM-derived features, including both structured prompts and free-form summaries. The study demonstrates that simple feature integration improves predictive performance by about 15%, with ensemble methods achieving the strongest results (Macro F1 ~71.3 and higher minority-class recall). Findings suggest that domain knowledge helps highlight counselor strategies, while session-level features provide robust representations for long conversations, collectively advancing practical understanding and support in crisis counseling contexts.

Abstract

Understanding the dynamics of counseling conversations is an important task, yet it remains a challenging NLP problem despite recent advances in Transformer-based pre-trained language models. This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker. We empirically show that state-of-the-art language models such as Transformer-based models and GPT models fail to predict the conversation outcome. To provide richer context to conversations, we incorporate human-annotated domain knowledge and LLM-generated features; simple integration of domain knowledge and LLM features improves model performance by approximately 15%. We argue that both domain knowledge and LLM-generated features can be exploited to better characterize counseling conversations when they are used as additional context for the conversations.

Paper Structure (25 sections, 4 figures, 7 tables)

Figures (4)

  • Figure 1: Shapley values of phrases in the counseling conversation (upper) and in the conversation with utterance-level features (lower). Areas highlighted in red push the model toward predicting the 'negative' class, and areas in blue push it toward the opposite class.
  • Figure 2: F1 score comparison between session-level feature input and summaries with stance. Performance of summaries with stance decreases when the length of the counseling conversation exceeds 3K tokens, while session-level feature input shows more consistent performance.
  • Figure 3: Distortion values for different numbers of clusters. Blue line indicates distortion values
  • Figure 4: Clustered sentences from two types of summaries. In most cases, plain summaries and summaries with stance produce similar aspects of the conversation. In a few clusters, the share of one summary type is meaningfully larger than that of the other. Clusters 3, 5, and 8 consist of around 60% plain summary items, while cluster 1 has the opposite distribution. Cluster 4, describing the stance of the help seeker, contains only summary-with-stance items.
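
The caption of Figure 3 refers to distortion values across different numbers of clusters. The sketch below shows one way such a curve could be computed. It assumes that "distortion" means the usual k-means quantity (the sum of squared distances from each point to its nearest centroid), which the caption does not confirm. The function names (`squared_distance`, `distortion`, `kmeans`) and the toy 2-D points are hypothetical; they stand in for whatever sentence representations the paper actually clusters.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm with a seeded random initialisation."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids


# Toy data (an assumption for illustration): two well-separated groups.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]

# Distortion for each candidate number of clusters.
curve = {k: distortion(points, kmeans(points, k)) for k in range(1, 5)}
for k, value in curve.items():
    print(k, round(value, 3))
```

Plotting `curve` against the number of clusters gives the kind of line the caption describes. A common heuristic is to pick the number of clusters at which the curve stops dropping sharply.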