Enhancing Sign Language Teaching: A Mixed Reality Approach for Immersive Learning and Multi-Dimensional Feedback
Hongli Wen, Yang Xu, Lin Li, Xudong Ru, Xingce Wang, Zhongke Wu
TL;DR
This work tackles the lack of real-time, feedback-rich sign language teaching by integrating monocular 3D hand pose reconstruction, a ternary action-evaluation framework, and a mixed-reality scene-based classroom. It introduces a one-stage pose reconstruction method based on a module-perceiving transformer and a quaternion-based action embedding for robust student-teacher comparison, complemented by DTW-based alignment feedback. A three-part MR teaching pipeline reconstructs instructor motions from 2D videos into SMPLX-based avatars, stores standard teaching sequences, and delivers real-time feedback through confusion, smoothness, and alignment metrics. Experiments on a multimodal Chinese Sign Language dataset show high semantic accuracy, real-time performance, credible evaluation, and improved learning outcomes, highlighting the practical potential of immersive MR education for scalable, feedback-driven sign language teaching.
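To make the DTW-based alignment idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each frame is reduced to a single unit quaternion in (w, x, y, z) order and uses the rotation angle between quaternions as the frame-to-frame cost. The function names `quat_angle`, `dtw_distance`, and `rot_z` are hypothetical and chosen only for illustration.

```python
import math


def quat_angle(q1, q2):
    """Rotation angle in radians between two unit quaternions (w, x, y, z)."""
    # abs() treats q and -q as the same rotation; min() guards against
    # floating-point dot products slightly above 1.
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))


def dtw_distance(student, teacher):
    """Classic O(n*m) dynamic time warping cost between two quaternion sequences."""
    n, m = len(student), len(teacher)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = quat_angle(student[i - 1], teacher[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rot_z(theta):
    """Unit quaternion for a rotation of theta radians about the z axis (toy data helper)."""
    return (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))


# Toy sequences: the student performs the same motion with a delayed start.
teacher = [rot_z(t) for t in (0.0, 0.5, 1.0)]
student = [rot_z(t) for t in (0.0, 0.0, 0.5, 1.0)]
alignment_cost = dtw_distance(student, teacher)
```

Because DTW may match several student frames to one teacher frame, the delayed start in the toy data adds essentially no cost, whereas a genuinely different pose would. A full system would presumably aggregate such costs over many hand joints rather than one quaternion per frame.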
Abstract
Traditional sign language teaching methods face challenges such as limited feedback and diverse learning scenarios. 2D resources lack real-time feedback, while classroom teaching is constrained by a scarcity of teachers, and methods based on VR and AR offer only relatively primitive interaction feedback mechanisms. This study proposes an innovative teaching model that uses real-time monocular vision and mixed reality technology. First, we introduce an improved hand-posture reconstruction method that retains sign language semantics and supports real-time feedback. Second, we propose a ternary evaluation algorithm for comprehensive assessment, whose results show good consistency with the judgments of sign language experts. Furthermore, we use mixed reality technology to construct a scenario-based 3D sign language classroom and explore the user experience of scenario teaching. Overall, this paper presents a novel teaching method that provides an immersive learning experience, advanced posture reconstruction, and precise feedback, and it received positive user feedback on both user experience and learning effectiveness.
