Student Classroom Behavior Detection based on Improved YOLOv7
Fan Yang
TL;DR
This paper addresses the detection of student classroom behaviors in video, where crowded scenes and occlusion reduce accuracy. It introduces the SCB-Dataset (4.2k images, 18.4k labels; three behaviors: hand raising, reading, and writing) and improves the one-stage detector YOLOv7 by integrating Bi-Level Routing Attention (BRA) and the Wise-IoU loss. The resulting YOLOv7+BRA and YOLOv7+Wise-IoU models (Wise-IoU variants v1–v3) reach an mAP@0.5 of about 79% and outperform the baseline in precision and mAP@0.5:0.95. These gains support more accurate, real-time monitoring of classroom behavior for teaching-effectiveness analysis and educational analytics.
Abstract
Accurately detecting student behavior in classroom videos can aid in analyzing students' classroom performance and improving teaching effectiveness. However, the accuracy of current behavior detection methods is low. To address this challenge, we propose a student classroom behavior detection method based on an improved YOLOv7. First, we created the Student Classroom Behavior dataset (SCB-Dataset), which contains 4.2k images and 18.4k labels covering three behaviors: hand raising, reading, and writing. To improve detection accuracy in crowded scenes, we integrated the BiFormer attention module and Wise-IoU into the YOLOv7 network. Finally, we conducted experiments on the SCB-Dataset, where the model achieved an mAP@0.5 of 79%, a 1.8% improvement over previous results. The SCB-Dataset and code are available for download at: https://github.com/Whiffe/SCB-dataset.
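To make the two IoU-based quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes boxes in (x1, y1, x2, y2) corner format and uses the Wise-IoU v1 form as commonly stated: the IoU loss scaled by an exponential penalty on the distance between box centers, normalized by the size of the smallest enclosing box. The function names `iou` and `wiou_v1_loss` and the `eps` guard are illustrative choices. In a training framework the enclosing-box term is normally detached from the gradient, which plain Python does not model.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def wiou_v1_loss(pred, gt, eps=1e-9):
    """Sketch of Wise-IoU v1: exp(center distance^2 / enclosing diag^2) * (1 - IoU)."""
    # Centers of the predicted and ground-truth boxes
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    # Width and height of the smallest box enclosing both
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])
    dist2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    r_wiou = math.exp(dist2 / (wg ** 2 + hg ** 2 + eps))
    return r_wiou * (1.0 - iou(pred, gt))

def is_true_positive(pred, gt, threshold=0.5):
    """Matching criterion behind mAP@0.5: a detection counts if IoU >= 0.5."""
    return iou(pred, gt) >= threshold
```

mAP@0.5 averages precision over recall levels and classes using the 0.5 IoU threshold shown in `is_true_positive`. mAP@0.5:0.95 repeats this for thresholds from 0.5 to 0.95 in steps of 0.05 and averages the results.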
