Analyzing Participants' Engagement during Online Meetings Using Unsupervised Remote Photoplethysmography with Behavioral Features
Alexander Vedernikov, Zhaodong Sun, Virpi-Liisa Kykyri, Mikko Pohjola, Miriam Nokia, Xiaobai Li
TL;DR
This work measures engagement in online meetings without physical sensors by extracting heart rate variability (HRV) features from video using unsupervised remote photoplethysmography (rPPG) and fusing them with behavioral cues. Using a novel Engagement Dataset collected in real-world group meetings, the authors reconstruct rPPG signals with Contrast-Phys, derive HRV features, and perform a two-stage engagement classification with feature-level fusion of HRV and facial/motion cues. HRV features alone reach an accuracy of around 0.94 with 2–4 minute observation windows, and adding behavioral features raises accuracy to about 0.96, with the behavioral features labeled BF4 and BF5 providing the strongest gains. The approach demonstrates practical, non-intrusive engagement estimation suitable for real-time monitoring and group-dynamics analysis, with potential applications in healthcare, education, and workplace stress assessment.
Abstract
Engagement measurement finds application in healthcare, education, and services. Physiological and behavioral features are both viable indicators of engagement, but traditional physiological measurement is impractical because it requires contact sensors. We demonstrate the feasibility of unsupervised remote photoplethysmography (rPPG) as an alternative to contact sensors for deriving heart rate variability (HRV) features, and we fuse these with behavioral features to measure engagement in online group meetings. First, a unique Engagement Dataset of online interactions among social workers is collected with granular engagement labels, offering insight into virtual meeting dynamics. Second, a pre-trained rPPG model is customized to reconstruct rPPG signals from video meetings in an unsupervised manner, enabling the calculation of HRV features. Third, we demonstrate that engagement can be estimated from HRV features using short observation windows, with a notable improvement when longer observation windows of two to four minutes are used. Fourth, we evaluate the effectiveness of behavioral cues when fused with physiological data, which further improves engagement estimation performance. An accuracy of 94% is achieved when only HRV features are used, eliminating the need for contact sensors or ground truth signals; adding behavioral cues raises the accuracy to 96%. These results indicate that facial video analysis alone can support precise engagement measurement, which is beneficial for future applications.
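To make the step from a reconstructed pulse signal to HRV features concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' pipeline: the input is a synthetic sine wave standing in for an rPPG signal (it is not produced by Contrast-Phys), the peak detector is a deliberately simple local-maximum rule, and the three features shown (mean heart rate, SDNN, RMSSD) are standard time-domain HRV measures that may differ from the feature set used in the paper. The function names `peaks_to_ibi_ms` and `hrv_features` are hypothetical.

```python
import numpy as np


def peaks_to_ibi_ms(signal, fs):
    """Convert a pulse-like signal into inter-beat intervals (ms).

    Assumption: a beat is any sample that is a local maximum and lies
    above the signal mean. Real rPPG signals usually need band-pass
    filtering and a more robust peak detector before this step.
    """
    x = np.asarray(signal, dtype=float)
    mid = x[1:-1]
    is_peak = (mid > x[:-2]) & (mid >= x[2:]) & (mid > x.mean())
    peak_idx = np.where(is_peak)[0] + 1
    return np.diff(peak_idx) / fs * 1000.0


def hrv_features(ibi_ms):
    """Compute a few standard time-domain HRV features from IBIs in ms."""
    ibi = np.asarray(ibi_ms, dtype=float)
    successive_diffs = np.diff(ibi)
    return {
        "mean_hr_bpm": 60000.0 / ibi.mean(),
        "sdnn_ms": ibi.std(ddof=1),  # sample standard deviation of IBIs
        "rmssd_ms": np.sqrt(np.mean(successive_diffs ** 2)),
    }


# Synthetic stand-in for an rPPG trace: 60 s at 30 frames per second,
# with a constant pulse rate of 1.2 Hz (72 beats per minute).
fs = 30.0
t = np.arange(0, 60, 1 / fs)
synthetic_pulse = np.sin(2 * np.pi * 1.2 * t)

features = hrv_features(peaks_to_ibi_ms(synthetic_pulse, fs))
print(features)
```

Because the synthetic pulse has a constant rate, its variability features come out as zero; a real rPPG trace would yield non-zero SDNN and RMSSD. In a setting like the one described above, such a feature vector would be computed per observation window (for example, two to four minutes) and then passed to a classifier, optionally concatenated with behavioral features.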
