Video2MR: Automatically Generating Mixed Reality 3D Instructions by Augmenting Extracted Motion from 2D Videos

Keiichi Ihara, Kyzyl Monteiro, Mehrad Faridan, Rubaiat Habib Kazi, Ryo Suzuki

TL;DR

Video2MR automatically converts 2D instructional videos into MR 3D avatars and augments them with four techniques (pose comparison, motion visualization, embodied temporal navigation, and avatar repositioning) to create engaging, scalable MR sports and exercise instructions. It uses DeepMotion for motion extraction and HoloLens 2 with Azure Kinect to deliver real-time feedback and visualization, including a pose-match indicator and 3D gaze/trajectory cues. Through a formative prototype, a user study (n=12), and expert reviews (n=6) across multiple sports, the system shows increased co-presence, engagement, and fun compared with 2D videos, while highlighting challenges in tracking accuracy, first-person visibility, and equipment integration. The work suggests concrete future enhancements in object interaction, verbal guidance, and avatar realism to broaden the applicability and effectiveness of automated MR instruction.

Abstract

This paper introduces Video2MR, a mixed reality system that automatically generates 3D sports and exercise instructions from 2D videos. Mixed reality instructions have great potential for physical training, but existing works require substantial time and cost to create these 3D experiences. Video2MR overcomes this limitation by transforming arbitrary instructional videos available online into MR 3D avatars with AI-enabled motion capture (DeepMotion). Then, it automatically enhances the avatar motion through the following augmentation techniques: 1) contrasting and highlighting differences between the user and avatar postures, 2) visualizing key trajectories and movements of specific body parts, 3) manipulating time and speed using body motion, and 4) spatially repositioning avatars for different perspectives. We developed the system on HoloLens 2 and Azure Kinect and showcase various use cases, including yoga, dancing, soccer, tennis, and other physical exercises. The study results confirm that Video2MR provides more engaging and playful learning experiences than existing 2D video instructions.
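
To make the first augmentation technique concrete, the following is a minimal sketch of how differences between a user posture and an avatar posture might be detected. It is not the authors' implementation: the joint names, the bone list, the 20-degree threshold, and the function names are all hypothetical. It compares the direction of each bone rather than raw joint positions, so the result does not depend on where the user stands relative to the avatar.

```python
import math
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Pose = Dict[str, Vec3]  # joint name -> 3D position (hypothetical joint names)

# Hypothetical subset of bones to compare, as (parent joint, child joint).
BONES: List[Tuple[str, str]] = [
    ("shoulder_r", "elbow_r"),
    ("elbow_r", "wrist_r"),
]


def bone_direction(pose: Pose, parent: str, child: str) -> Vec3:
    """Return the unit vector pointing from the parent joint to the child joint."""
    px, py, pz = pose[parent]
    cx, cy, cz = pose[child]
    dx, dy, dz = cx - px, cy - py, cz - pz
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError(f"joints {parent!r} and {child!r} coincide")
    return (dx / norm, dy / norm, dz / norm)


def bone_angle_deg(user: Pose, avatar: Pose, parent: str, child: str) -> float:
    """Angle in degrees between the same bone in the user and avatar poses."""
    u = bone_direction(user, parent, child)
    a = bone_direction(avatar, parent, child)
    dot = sum(x * y for x, y in zip(u, a))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
    return math.degrees(math.acos(dot))


def mismatched_bones(
    user: Pose,
    avatar: Pose,
    bones: List[Tuple[str, str]] = BONES,
    threshold_deg: float = 20.0,  # assumed tolerance, not taken from the paper
) -> List[Tuple[str, str]]:
    """Return the bones whose direction differs by more than the threshold."""
    return [
        bone
        for bone in bones
        if bone_angle_deg(user, avatar, bone[0], bone[1]) > threshold_deg
    ]


# Example: the avatar bends its forearm sideways while the user's arm hangs down.
avatar_pose: Pose = {
    "shoulder_r": (0.0, 1.5, 0.0),
    "elbow_r": (0.0, 1.2, 0.0),
    "wrist_r": (0.3, 1.2, 0.0),
}
user_pose: Pose = {
    "shoulder_r": (0.1, 1.4, 0.0),
    "elbow_r": (0.1, 1.1, 0.0),
    "wrist_r": (0.1, 0.8, 0.0),
}
to_highlight = mismatched_bones(user_pose, avatar_pose)
```

In this example only the forearm bone is flagged, since the upper arms point the same way in both poses. A renderer could then highlight the flagged bones on the avatar, and an empty list could drive a pose-match indicator.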
