Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts

Shrestha Mohanty, Sarah Xuan, Jacob Jobraeel, Anurag Kumar, Deb Roy, Jad Kabbara

TL;DR

This work tackles the challenge of understanding long-form social group conversations when only highlighted excerpts are shared across contexts. It proposes effective contextualization, in which an LLM injects socially grounded attributes to produce context-enriched excerpts that improve comprehension, readability, and empathy, and it validates these gains through both subjective human judgments and objective faithfulness metrics. A new resource, the Human-annotated Salient Excerpts (HSE) dataset built on Fora, enables systematic evaluation of contextualization strategies. Of the two approaches compared, implicit and explicit contextualization, explicit prompting generally yields better results. The study also demonstrates that context-enriched excerpts can produce more coherent and detailed conversation summaries than traditional full-conversation or excerpt-only summaries, underscoring the potential for improved communication and understanding in civic and social discourse. The work notes that LLMs remain limited in capturing nuanced social aspects, terminology, and background details, and it provides a foundation for future work in socially aware NLP and long-form conversational analysis.

Abstract

We focus on enhancing comprehension in small-group recorded conversations, which serve as a medium to bring people together and provide a space for sharing personal stories and experiences on crucial social matters. One way to parse and convey information from these conversations is by sharing highlighted excerpts in subsequent conversations. This can help promote a collective understanding of relevant issues, by highlighting perspectives and experiences to other groups of people who might otherwise be unfamiliar with and thus unable to relate to these experiences. The primary challenge that arises then is that excerpts taken from one conversation and shared in another setting might be missing crucial context or key elements that were previously introduced in the original conversation. This problem is exacerbated when conversations become lengthier and richer in themes and shared experiences. To address this, we explore how Large Language Models (LLMs) can enrich these excerpts by providing socially relevant context. We present approaches for effective contextualization to improve comprehension, readability, and empathy. We show significant improvements in understanding, as assessed through subjective and objective evaluations. While LLMs can offer valuable context, they struggle with capturing key social aspects. We release the Human-annotated Salient Excerpts (HSE) dataset to support future work. Additionally, we show how context-enriched excerpts can provide more focused and comprehensive conversation summaries.

Paper Structure (34 sections, 48 figures, 5 tables)

Figures (48)

  • Figure 1: Overview of Effective Contextualization of Salient Excerpts in Social Conversations: The figure illustrates the challenges posed when excerpts from longer social group conversations are shared without sufficient context. To address this, a large language model (LLM) is employed to generate context that incorporates key social attributes, aiming to enhance the comprehension of the excerpt, a process we term effective contextualization. This enhanced context is then evaluated through human judgment to gauge its usefulness and through objective measures of faithfulness to assess its accuracy.
  • Figure 2: Number of defined terms in each category
  • Figure 4: a) Length of conversation (in words) in Fora dataset and b) Length of excerpts (in words) in human-annotated excerpts dataset
  • Figure 5: GPT Prompts Comparison Survey Demographics
  • Figure 6: LLM Contexts Comparison Survey Demographics
  • ...and 43 more figures