Long-term Human Participation Assessment In Collaborative Learning Environments Using Dynamic Scene Analysis

Wenjing Shi, Phuong Tran, Sylvia Celedón-Pattichis, Marios S. Pattichis

TL;DR

This work addresses long-term participation assessment in real-world collaborative classrooms. It introduces the AOLME datasets and a two-stage approach: detection of the student group near the camera, followed by dynamic participant tracking within the detected group, so that the analysis stays robust to occlusion and movement. Key contributions include a multi-representation group detector that combines YOLO with AM-FM features, a video face recognition workflow that uses sparse sampling and K-means clustering to handle varied poses, and a Dynamic Participant Tracking (DPT) finite-state machine that maintains accurate presence information over long sessions. With an F1 score of 0.85 for group detection on AOLME-GT and 82.3% accuracy for DPT on long videos, the paper demonstrates practical viability for educational data analytics and supports visualization via participation maps.
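To make the idea of a presence-tracking finite-state machine concrete, here is a minimal sketch in Python. It is not the paper's DPT: the three states (`PRESENT`, `OCCLUDED`, `ABSENT`), the `patience` threshold, and the names `ParticipantTracker` and `track` are all assumptions made for illustration, since this summary does not specify the actual states or transitions.

```python
from enum import Enum, auto


class Presence(Enum):
    PRESENT = auto()   # student recognized in the current frame
    OCCLUDED = auto()  # recently seen, temporarily not detected
    ABSENT = auto()    # missing long enough to count as out of the scene


class ParticipantTracker:
    """Hypothetical per-student presence state machine (not the paper's DPT)."""

    def __init__(self, patience: int = 3):
        self.patience = patience  # assumed number of missed frames tolerated
        self.state = Presence.ABSENT
        self.missed = 0

    def update(self, detected: bool) -> Presence:
        if detected:
            # Any detection brings the student back to PRESENT.
            self.state = Presence.PRESENT
            self.missed = 0
        elif self.state is not Presence.ABSENT:
            # Tolerate short gaps before declaring the student absent.
            self.missed += 1
            if self.missed <= self.patience:
                self.state = Presence.OCCLUDED
            else:
                self.state = Presence.ABSENT
        return self.state


def track(detections, patience=3):
    """Run the state machine over a sequence of per-frame detection flags."""
    tracker = ParticipantTracker(patience)
    return [tracker.update(d).name for d in detections]
```

The design choice illustrated here is that a short detection gap does not immediately flip a student to absent, which is one plausible way to keep presence information stable when students briefly turn away or are occluded.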

Abstract

The paper develops datasets and methods to assess student participation in real-life collaborative learning environments. In collaborative learning environments, students are organized into small groups where they are free to interact within their group. As a result, students can move around freely, which causes strong pose variation; they can also move out of and re-enter the camera scene, or face away from the camera. We formulate the problem of assessing student participation as two subproblems: (i) student group detection against strong background interference from other groups, and (ii) dynamic participant tracking within the group. A large independent testing dataset of 12,518,250 student label instances, with a total duration of 21 hours and 22 minutes of real-life video, is used to evaluate the performance of our proposed method for student group detection. The proposed method of using multiple image representations is shown to perform as well as or better than YOLO on all video instances. Over the entire dataset, the proposed method achieved an F1 score of 0.85 compared to 0.80 for YOLO. Following student group detection, the paper presents the development of a dynamic participant tracking system for assessing student group participation through long video sessions. The proposed dynamic participant tracking system is shown to perform exceptionally well, missing a student in just one out of 35 testing videos. In comparison, a state-of-the-art method fails to track students in 14 out of the 35 testing videos. The proposed method achieves 82.3% accuracy on an independent set of long, real-life collaborative videos.
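For readers less familiar with the F1 score used to compare the two detectors, the short Python sketch below shows the standard definition as the harmonic mean of precision and recall. The function name `f1_score` and the example counts are illustrative assumptions; the counts are made up and are not taken from the paper's data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts chosen only so that precision and recall both equal 0.85.
example = f1_score(tp=85, fp=15, fn=15)
```

Because F1 penalizes both false positives (for example, detecting students from background groups) and false negatives (missing students in the target group), it is a natural single-number summary for the group detection task described above.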

Paper Structure (38 sections, 1 equation, 14 figures, 11 tables, 1 algorithm)

This paper contains 38 sections, 1 equation, 14 figures, 11 tables, 1 algorithm.

Figures (14)

  • Figure 1: Examples of the challenges associated with developing methods for assessing student participation based on the AOLME datasets.
  • Figure 2: A simple example to demonstrate the issues for training and testing dynamic participant tracking. In this example, we only show annotation for a single student per image. We note that there is no bounding box for the student in (d) because he is not visible. For the training and testing datasets, in each frame, we mark all of the students for each group.
  • Figure 3: AOLME student participation analysis system. We detect groups every second. We perform face recognition and dynamic participant tracking every frame.
  • Figure 4: Student Group Detection System.
  • Figure 5: AM-FM representation of the classroom environment.
  • ...and 9 more figures
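The Figure 3 caption states that groups are detected every second while face recognition and participant tracking run on every frame. A minimal Python sketch of that two-rate schedule follows. The function `analyze` and its callback parameters `detect_group` and `track` are hypothetical placeholders, not the paper's interfaces, and the sketch assumes a constant integer frame rate.

```python
from typing import Any, Callable, Iterable, List


def analyze(
    frames: Iterable[Any],
    fps: int,
    detect_group: Callable[[Any], Any],
    track: Callable[[Any, Any], Any],
) -> List[Any]:
    """Run group detection once per second and tracking on every frame."""
    region = None
    results = []
    for index, frame in enumerate(frames):
        if index % fps == 0:
            # First frame of each second: refresh the detected group region.
            region = detect_group(frame)
        # Every frame: track participants inside the most recent region.
        results.append(track(frame, region))
    return results


# Usage with stand-in callbacks (frames are just integers here).
demo = analyze(
    frames=range(6),
    fps=3,
    detect_group=lambda f: f"region@{f}",
    track=lambda f, r: (f, r),
)
```

In this toy run, frames 0 to 2 reuse the region found at frame 0 and frames 3 to 5 reuse the region found at frame 3, which mirrors the per-second versus per-frame split described in the caption.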