Table of Contents
Fetching ...

ClassComet: Exploring and Designing AI-generated Danmaku in Educational Videos to Enhance Online Learning

Zipeng Ji, Pengcheng An, Jian Zhao

TL;DR

This work tackles the challenge of scarce and variable-quality danmaku in educational videos by leveraging large multimodal models (LMMs) to automatically generate high-quality, classroom-relevant danmaku. It identifies desirable content- and emotion-related danmaku through a formative study and delivers ClassComet, a platform that uses clip- and text-level video understanding, virtual personas, and a structured prompt template to produce diverse danmaku. In a controlled study, AI-generated danmaku—especially when combining content- and emotion-related types—improved engagement and learning outcomes, with quality metrics approaching those of human-created danmaku on several dimensions. The results demonstrate the feasibility and value of AI-generated danmaku for scalable, consistent, and effective educational video experiences, while outlining limitations and future avenues such as real-time generation, personalized personas, and longitudinal effects.

Abstract

Danmaku, users' live comments synchronized with, and overlaying on videos, has recently shown potential in promoting online video-based learning. However, user-generated danmaku can be scarce-especially in newer or less viewed videos and its quality is unpredictable, limiting its educational impact. This paper explores how large multimodal models (LMM) can be leveraged to automatically generate effective, high-quality danmaku. We first conducted a formative study to identify the desirable characteristics of content- and emotion-related danmaku in educational videos. Based on the obtained insights, we developed ClassComet, an educational video platform with novel LMM-driven techniques for generating relevant types of danmaku to enhance video-based learning. Through user studies, we examined the quality of generated danmaku and their influence on learning experiences. The results indicate that our generated danmaku is comparable to human-created ones, and videos with both content- and emotion-related danmaku showed significant improvement in viewers' engagement and learning outcome.

ClassComet: Exploring and Designing AI-generated Danmaku in Educational Videos to Enhance Online Learning

TL;DR

This work tackles the challenge of scarce and variable-quality danmaku in educational videos by leveraging large multimodal models (LMMs) to automatically generate high-quality, classroom-relevant danmaku. It identifies desirable content- and emotion-related danmaku through a formative study and delivers ClassComet, a platform that uses clip- and text-level video understanding, virtual personas, and a structured prompt template to produce diverse danmaku. In a controlled study, AI-generated danmaku—especially when combining content- and emotion-related types—improved engagement and learning outcomes, with quality metrics approaching those of human-created danmaku on several dimensions. The results demonstrate the feasibility and value of AI-generated danmaku for scalable, consistent, and effective educational video experiences, while outlining limitations and future avenues such as real-time generation, personalized personas, and longitudinal effects.

Abstract

Danmaku, users' live comments synchronized with, and overlaying on videos, has recently shown potential in promoting online video-based learning. However, user-generated danmaku can be scarce-especially in newer or less viewed videos and its quality is unpredictable, limiting its educational impact. This paper explores how large multimodal models (LMM) can be leveraged to automatically generate effective, high-quality danmaku. We first conducted a formative study to identify the desirable characteristics of content- and emotion-related danmaku in educational videos. Based on the obtained insights, we developed ClassComet, an educational video platform with novel LMM-driven techniques for generating relevant types of danmaku to enhance video-based learning. Through user studies, we examined the quality of generated danmaku and their influence on learning experiences. The results indicate that our generated danmaku is comparable to human-created ones, and videos with both content- and emotion-related danmaku showed significant improvement in viewers' engagement and learning outcome.

Paper Structure

This paper contains 60 sections, 9 figures, 2 tables.

Figures (9)

  • Figure 1: A sample screenshot of an educational video with danmaku. The translations of the displayed danmaku are: 1. "I didn't linked it to equations before", 2. "fill the plane!!!", 3. "This chalk and eraser are so useful", 4. "Saying others are pretending to understand, lol", 5. "Wow, this is magic", 6. "29 people staying up late", 7. "Watch MIT's linear algebra class", 8. "Vector space?".
  • Figure 2: User interface of ClassComet: (A) video control buttons for play/pause, volume adjustment, and speed control, (B) danmaku embedded in the video, (C) danmaku control that contains danmaku settings and an input box to send danmaku, and (D) a video sidebar for selecting other educational videos.
  • Figure 3: Examples of different types of danmaku (including both content-related and emotion-related) generated by ClassComet in educational videos.
  • Figure 4: The ClassComet pipeline for automatically generating danmaku in educational videos includes four steps: (A) Extract video information, segment the video into clips using scene detection, and generate descriptions at both the text- and clip-level; (B) Create virtual personas; (C) Design a structured danmaku prompt template and set parameters; and (D) Embed generated danmaku into educational videos.
  • Figure 5: The structured prompt template of generating danmaku which consists of the system and user prompts as well as the pass parameters including two video descriptions (text and clip levels) and personas.
  • ...and 4 more figures