Improving Emotional Support Delivery in Text-Based Community Safety Reporting Using Large Language Models

Yiren Liu; Yerong Li; Ryan Mayfield; Yun Huang

TL;DR

The authors develop and implement dispatcherLLM, a fine-tuned Large Language Model that suggests replies by emulating human dispatchers' language with appropriate emotional support, demonstrating the significant potential of generative AI to improve service delivery.

Abstract

Emotional support is a crucial aspect of communication between community members and police dispatchers during incident reporting. However, there is a lack of understanding about how emotional support is delivered through text-based systems, especially in various non-emergency contexts. In this study, we analyzed two years of chat logs comprising 57,114 messages across 8,239 incidents from 130 higher education institutions. Our empirical findings revealed significant variations in the emotional support provided by dispatchers, influenced by incident type and service time, as well as a noticeable decline in support over time across multiple organizations. To improve the consistency and quality of emotional support, we developed and implemented a fine-tuned Large Language Model (LLM), named dispatcherLLM. We evaluated dispatcherLLM by comparing its generated responses to those of human dispatchers and other off-the-shelf models using real chat messages. Additionally, we conducted a human evaluation to assess the perceived effectiveness of the support provided by dispatcherLLM. This study not only contributes new empirical understandings of emotional support in text-based dispatch systems but also demonstrates the significant potential of generative AI in improving service delivery.

Paper Structure (41 sections, 1 equation, 9 figures, 4 tables)

Introduction
Related Work
The Role of Emotional Support in Existing Safety Incident Reporting Systems
Community Risk Reporting Systems
Emotional Supports with Conversational Systems
Research Questions
Method
Dataset
Data ethics
Data cleaning and pre-processing
Detecting Emotions Involved in the Chats
Classifying emotions with RoBERTa and identifying emotional support
Measuring conversational emotion polarity
Measuring Information Collection through Event Argument Extraction
Event ontology
...and 26 more sections

Figures (9)

Figure 1: The user interface of the LiveSafe mobile app; a) The user needs to select a tip category after choosing to submit a tip; b) After the tip category is chosen, the user can submit text and image descriptions and connect to a human agent; c) The dispatcher can connect and chat with the user using the Command Dashboard.
Figure 2: Average negativity score and total tip count by tip category. ANOVA revealed a significant difference ($F(17, 8221) = 4.65$, $p < 0.001$) in users' emotional polarity across incident categories during reporting.
Figure 3: Change in the sentiment ratio of users' utterances throughout conversations: beginning refers to the first 1/3 of each conversation, middle refers to the second 1/3, and end refers to the last 1/3. The t-tests indicate a significant increase in users' positive sentiment as their conversations with dispatchers progressed.
Figure 4: Our dispatcherLLM is used to automatically generate responses, given a user message. The left side of the figure shows an actual chat log between a user and a dispatcher, regarding a Drugs / Alcohol incident. On the right, the figure displays a response generated by our dispatcherLLM. This illustration serves not only to compare the two types of responses but also to demonstrate a potential use case scenario in which dispatcherLLM suggests responses to human dispatchers, thereby enhancing the quality of their replies.
Figure 5: A randomly selected Harassment/Abuse incident was showcased in the user survey. The follow-up question generated by dispatcherLLM was perceived by participants to be more contextualized and specific than GPT-3.5's response.
...and 4 more figures
