Learning Nonverbal Cues in Multiparty Social Interactions for Robotic Facilitators

Antonio Lech Martin-Ozimek, Isuru Jayarathne, Su Larb Mon, Jouhyeong Chew

TL;DR

The paper tackles the generation of nonverbal cues, specifically gaze, for robotic facilitators in multiparty social interactions, addressing the gap left by purely language-based facilitators. It adapts Implicit Behavior Cloning (IBC) to a gaze-generation task using a real multiparty gaze dataset and compares it with an explicit MSE BC model across session types, evaluating with ASM, $R^2$, and SPARC. The contributions are a replication and extension of the IBC approach to nonverbal cues and a gaze-generation model suited to social interaction settings, with results indicating that IBC delivers smoother and more natural gaze trajectories than MSE BC. The work demonstrates the potential for robots to take part in complex human-robot interactions without a human facilitator, with implications for education and social robotics, where nonverbal communication is crucial.

Abstract

Conventional behavior cloning (BC) models often struggle to replicate the subtleties of human actions. Previous studies have attempted to address this issue by developing a new BC technique, Implicit Behavior Cloning (IBC), which consistently outperformed conventional Mean Squared Error (MSE) BC models in a variety of tasks. Our goal is to replicate the performance of the IBC model by Florence et al. [in Proceedings of the 5th Conference on Robot Learning, 164:158-168, 2022] for social interaction tasks using our custom dataset. While previous studies have explored the use of large language models (LLMs) for enhancing group conversations, they often overlook the significance of nonverbal cues, which constitute a substantial part of human communication. We propose using IBC to replicate nonverbal cues such as gaze behaviors. The model is evaluated on various types of facilitator data and compared with an explicit MSE BC model. Results show that the IBC model outperforms the MSE BC model across session types on the same metrics used in the previous IBC paper. Although some metrics show mixed results, which can be explained by the characteristics of the custom social interaction dataset, we successfully replicated the IBC model to generate nonverbal cues. Our contributions are (1) the replication and extension of the IBC model, and (2) a nonverbal cue generation model for social interaction. These advancements facilitate the integration of robots into complex human-robot interactions, e.g., in the absence of a human facilitator.
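
To make the contrast between the two training objectives concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 2D gaze action (a hypothetical yaw/pitch pair) and uses a hand-written `toy_energy` function in place of a learned network; all function names here are illustrative. Explicit BC regresses the action and penalises squared error, whereas implicit BC, as described by Florence et al., scores observation-action pairs with an energy function, trains it with an InfoNCE-style loss against sampled negative actions, and acts by picking the lowest-energy candidate.

```python
import numpy as np

def mse_bc_loss(predicted_action, expert_action):
    """Explicit BC: mean squared error between predicted and expert action."""
    return float(np.mean((predicted_action - expert_action) ** 2))

def info_nce_loss(energy_fn, observation, expert_action, negative_actions):
    """Implicit BC: negative log-probability of the expert action under a
    softmax over negated energies of the expert and sampled negatives."""
    candidates = np.vstack([expert_action[None, :], negative_actions])
    logits = -np.array([energy_fn(observation, a) for a in candidates])
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-log_probs[0])  # index 0 is the expert action

def implicit_policy(energy_fn, observation, candidate_actions):
    """Inference by sampling: return the candidate with the lowest energy."""
    energies = np.array([energy_fn(observation, a) for a in candidate_actions])
    return candidate_actions[int(np.argmin(energies))]

def toy_energy(observation, action):
    # Hypothetical stand-in for a trained energy network (assumption):
    # low energy when the gaze action matches the observed target.
    return float(np.sum((action - observation) ** 2))

rng = np.random.default_rng(0)
observation = np.array([0.2, -0.1])    # hypothetical gaze target (yaw, pitch)
expert_action = np.array([0.2, -0.1])  # expert looks at the target
negatives = rng.uniform(-1.0, 1.0, size=(64, 2))

loss_good = info_nce_loss(toy_energy, observation, expert_action, negatives)
loss_bad = info_nce_loss(toy_energy, observation, np.array([0.9, 0.9]), negatives)
chosen = implicit_policy(toy_energy, observation,
                         np.vstack([negatives, expert_action]))
```

In this toy setup the InfoNCE loss is lower when the demonstrated action sits at the energy minimum than when it sits far from it, which is the signal that shapes the energy landscape during training. A practical system would replace `toy_energy` with a neural network and would likely use a more careful sampler for negatives and for inference.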

Paper Structure (21 sections, 6 equations, 1 figure, 3 tables)

This paper contains 21 sections, 6 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: An overview of the pipeline for training both models.