
Appropriateness of LLM-equipped Robotic Well-being Coach Language in the Workplace: A Qualitative Evaluation

Micol Spitale, Minja Axelsson, Hatice Gunes

TL;DR

The paper addresses the challenge of making LLM-generated language in robotic well-being coaches appropriate for workplace use. It takes a qualitative, multi-method approach: 17 employees interacted with an LLM-equipped robotic coach over four weeks, followed by individual interviews and a 1.5-hour focus group with 11 of them, in which participants evaluated the coach's language across seven scenario-based prompts. The key findings are that the coach's language should explore the coachee's feelings in depth, demonstrate empathy, and avoid assumptions made without clarifying follow-up questions, so as to prevent bias and stereotyping. These findings inform practical design guidelines for language-appropriate robotic coaches in real-world workplace deployments.

Abstract

Robotic coaches have recently been investigated as a means of promoting mental well-being in various contexts, such as workplaces and homes. With the widespread use of Large Language Models (LLMs), HRI researchers are called to consider language appropriateness when using LLM-generated language for robotic mental well-being coaches in the real world. This paper therefore presents the first work to investigate the language appropriateness of a robotic mental well-being coach in the workplace. To this end, we conducted an empirical study involving 17 employees who interacted over 4 weeks with a robotic mental well-being coach equipped with LLM-based capabilities. After the study, we interviewed them individually and conducted a 1.5-hour focus group with 11 of them. The focus group consisted of: i) an ice-breaking activity, ii) an evaluation of the robotic coach's language appropriateness in various scenarios, and iii) a listing of shoulds and shouldn'ts for designing appropriate robotic coach language for mental well-being. From our qualitative evaluation, we found that a language-appropriate robotic coach should (1) ask deep questions that explore the coachees' feelings, rather than superficial questions, (2) express and show emotional and empathic understanding of the context, and (3) not make any assumptions without clarifying with follow-up questions, to avoid bias and stereotyping. These results can inform the design of language-appropriate robotic coaches to promote mental well-being in real-world contexts.

Paper Structure (23 sections, 2 figures)

Figures (2)

  • Figure 1: Word cloud of the adjectives listed by the employees to describe the robotic coach.
  • Figure 2: List of shoulds (green post-its on the left) and shouldn'ts (orange post-its on the right) identified by the employees regarding appropriate behaviours for the robotic coach. Participants' initials on the post-its are anonymized using black boxes.