Speech Is Not Enough: Interpreting Nonverbal Indicators of Common Knowledge and Engagement

Derek Palmer, Yifan Zhu, Kenneth Lai, Hannah VanderHoeven, Mariah Bradford, Ibrahim Khebour, Carlos Mabrey, Jack Fitzgerald, Nikhil Krishnaswamy, Martha Palmer, James Pustejovsky

TL;DR

This work addresses support for group problem solving by interpreting nonverbal indicators of knowledge state and engagement across multiple modalities. It proposes a pipeline that fuses gaze, posture, gesture, and object-level cues to complement imperfect automatic speech recognition in classroom-style interactions, and demonstrates it in two scenarios: a Fibonacci weights knowledge task and a simulated classroom planning activity. The key contributions are a real-time nonverbal analysis framework and demonstrations of knowledge support and social cohesion detection in 3-person groups, with Dominated Discussion highlighted as a detectable negative dynamic. The approach is intended to generalize to AI Partners across domains and settings through portable object detection and a reduced annotation burden, supporting collaboration among three or more people in education, business, and other contexts.

Abstract

Our goal is to develop an AI Partner that can provide support for group problem solving and social dynamics. In multi-party working group environments, multimodal analytics is crucial for identifying nonverbal interactions of group members. In conjunction with their verbal participation, this creates a holistic understanding of collaboration and engagement that provides necessary context for the AI Partner. In this demo, we illustrate our present capabilities at detecting and tracking nonverbal behavior in student task-oriented interactions in the classroom, and the implications for tracking common ground and engagement.

Paper Structure

This paper contains 6 sections and 2 figures.

Figures (2)

  • Figure 1: Object detection in Weights Task
  • Figure 2: Contrasting engagement levels in simulated project planning