Measuring Student Behavioral Engagement using Histogram of Actions
Ahmed Abdelkawy, Aly Farag, Islam Alkabbany, Asem Ali, Chris Foreman, Thomas Tretter, Nicholas Hindy
TL;DR
This paper measures classroom behavioral engagement by automatically recognizing student actions from upper-body skeleton heatmaps with a 3D-CNN, building a histogram of action frequencies over each 2-minute segment, and predicting engagement from a gaze-augmented feature fed into a random forest. It introduces an Actions Dictionary, applies transfer learning via PoseConv3D for classroom action recognition, and uses kernel density estimation to synthesize disengaged samples that balance the training data. The authors present a new dataset with 1414 action-annotated segments (13 actions) and 112 engagement-annotated segments, and report a top-1 action accuracy of 86.32% and a weighted F1 of about 0.90 for engagement when synthesized samples are used. Overall, the method yields interpretable, scalable engagement estimates, improves on prior baselines, and offers practical potential for instructor feedback and classroom analytics.
Abstract
In this paper, we propose a novel technique for measuring behavioral engagement through student action recognition. The proposed approach first recognizes student actions and then predicts the student's behavioral engagement level. For student action recognition, we use human skeletons to model student postures and upper-body movements. To learn the dynamics of the student's upper body, a 3D-CNN model is used. The trained 3D-CNN model recognizes the actions within every 2-minute video segment, and these actions are then used to build a histogram of actions that encodes the student's actions and their frequencies. This histogram is used as input to an SVM classifier that classifies the student as engaged or disengaged. To evaluate the proposed framework, we build a dataset consisting of 1414 2-minute video segments annotated with 13 actions and 112 video segments annotated with two engagement levels. Experimental results indicate that student actions can be recognized with a top-1 accuracy of 83.63% and that the proposed framework can capture the average engagement of the class.
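The histogram-of-actions step can be illustrated with a minimal sketch. It assumes the action recognizer emits one integer label (0 to 12) per short clip inside a 2-minute segment; the function name `histogram_of_actions`, the clip granularity, the normalization to relative frequencies, and the example labels are all hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence

NUM_ACTIONS = 13  # the dataset is annotated with 13 actions


def histogram_of_actions(
    clip_labels: Sequence[int],
    num_actions: int = NUM_ACTIONS,
    normalize: bool = True,
) -> List[float]:
    """Count how often each action label occurs in one video segment.

    clip_labels: predicted action index for each clip in the segment
                 (assumed format, one label per clip).
    Returns a fixed-length vector with one bin per action.
    """
    for label in clip_labels:
        if not 0 <= label < num_actions:
            raise ValueError(f"label {label} outside [0, {num_actions})")

    counts = Counter(clip_labels)
    hist = [float(counts.get(action, 0)) for action in range(num_actions)]

    total = sum(hist)
    if normalize and total > 0:
        # Relative frequencies, so segments with different clip counts
        # remain comparable (an assumption, not the paper's stated choice).
        hist = [h / total for h in hist]
    return hist


# Hypothetical predictions for the clips of one 2-minute segment.
segment_labels = [0, 0, 3, 3, 3, 7, 0, 12]
hist = histogram_of_actions(segment_labels)
```

The resulting 13-dimensional vector is the kind of fixed-length feature that a downstream engagement classifier could take as input, regardless of how many clips a segment contains.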
