Toward Scalable Co-located Practical Learning: Assisting with Computer Vision and Multimodal Analytics

Xinyu Li; Linxuan Zhao; Roberto Martinez-Maldonado; Dragan Gasevic; Lixiang Yan

Abstract

This study examined whether a single ceiling-mounted camera could be used to capture fine-grained learning behaviours in co-located practical learning. In undergraduate nursing simulations, teachers first identified seven observable behaviour categories, which were then used to train a YOLO-based detector. Video data were collected from 52 sessions, and analyses focused on Scenario A because it produced greater behavioural variation than Scenario B. Annotation reliability was high (F1=0.933). On the held-out test set, the model achieved a precision of 0.789, a recall of 0.784, and an mAP@0.5 of 0.827. When only behaviour frequencies were compared, no robust differences were found between high- and low-performing groups. However, when behaviour labels were analysed together with spatial context, clear differences emerged in both task and collaboration performance. Higher-performing teams showed more patient interaction in the primary work area, whereas lower-performing teams showed more phone-related activity and more activity in secondary areas. These findings suggest that behavioural data are more informative when interpreted together with where they occur. Overall, the study shows that a single-camera computer vision approach can support the analysis of teamwork and task engagement in face-to-face practical learning without relying on wearable sensors.
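The idea of reading behaviour labels together with where they occur can be illustrated with a minimal sketch. It assumes each detection is a behaviour label plus an (x1, y1, x2, y2) bounding box in pixel coordinates, and that each space of interest is represented by a single centre point. The zone names, coordinates, label names, and the nearest-centre rule are all hypothetical and chosen for illustration; they are not taken from the study's pipeline.

```python
from collections import Counter
from math import hypot

# Hypothetical zone centres in pixel coordinates (illustrative, not from the paper).
ZONE_CENTRES = {
    "primary_bed": (400.0, 300.0),
    "secondary_bed": (900.0, 300.0),
    "nurse_station": (650.0, 700.0),
}


def box_centre(box):
    """Return the centre (x, y) of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def nearest_zone(point, zones=ZONE_CENTRES):
    """Return the name of the zone whose centre is closest to the point."""
    px, py = point
    return min(
        zones,
        key=lambda name: hypot(px - zones[name][0], py - zones[name][1]),
    )


def count_behaviour_by_zone(detections, zones=ZONE_CENTRES):
    """Count (behaviour, zone) pairs over a list of (label, box) detections."""
    counts = Counter()
    for label, box in detections:
        zone = nearest_zone(box_centre(box), zones)
        counts[(label, zone)] += 1
    return counts


# Hypothetical detections; the label names are assumptions for this sketch.
detections = [
    ("patient_interaction", (350, 250, 450, 350)),
    ("patient_interaction", (380, 280, 460, 360)),
    ("phone_use", (860, 260, 940, 340)),
    ("phone_use", (600, 650, 700, 750)),
]
counts = count_behaviour_by_zone(detections)
```

Counting (behaviour, zone) pairs rather than behaviours alone is what lets two teams with the same overall behaviour frequencies still be distinguished by where those behaviours took place.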

Paper Structure (31 sections, 11 figures, 5 tables)

Introduction
Background
Online Versus In-Person Learning Analytics
Multimodal Learning Analytics Approaches
Advances in Computer Vision
Computer Vision for Learning Behaviours Detection
Learning Context Infusion
Computer Vision Model Creation
Computer Vision Deployment
Research Questions
Methods
Learning Context
Apparatus and Data Collection
Ethics and Privacy Protection Measures
Learning Behaviours
...and 16 more sections

Figures (11)

Figure 1: Computer vision approach for learning behaviours detection
Figure 2: Teams of four students and a teacher playing the role of a family member of the patient in Bed 3, in the specialised classroom space. The points and labels mark the centres of the different spaces of interest. The tracking boxes identify individuals and the learning actions that occurred in the simulation classroom. The black areas inside the tracking boxes protect individuals’ facial identities.
Figure 3: Model Training Results
Figure 4: Confusion Matrix (Normalised)
Figure 5: Precision Recall Curve
...and 6 more figures
