VIVID: Human-AI Collaborative Authoring of Vicarious Dialogues from Lecture Videos

Seulgi Choi; Hyewon Lee; Yoonjoo Lee; Juho Kim

TL;DR

This work tackles disengagement in long online lectures in two ways: it proposes five design guidelines for converting monologue videos into pedagogically meaningful vicarious dialogues, and it presents VIVID, a collaborative system in which instructors co-design dialogues with LLMs. Through a design workshop and a within-subjects study (N=12), the authors show that VIVID enables more efficient dialogue authoring and yields higher-quality, dynamically patterned dialogues that are cognitively accessible and immersive for learners. The evaluation combines user studies with technical assessments of the prompting pipelines. It shows that end-to-end dialogue authoring with VIVID improves metrics related to dynamism, immersion, and metacognitive engagement, and it identifies explainability and verbosity as areas for improvement. Overall, VIVID presents a scalable, instructor-centered workflow for generating high-quality educational dialogues from lecture videos, with potential applicability across languages, subjects, and learning contexts.

Abstract

Lengthy monologue-style online lectures cause learners to lose engagement easily. Designing lectures in a "vicarious dialogue" format can foster learners' cognitive activities more than a monologue style does. However, designing online lectures in a dialogue style that caters to the diverse needs of learners is laborious for instructors. We conducted a design workshop with eight educational experts and seven instructors to derive key guidelines and explore the potential use of large language models (LLMs) for transforming a monologue lecture script into pedagogically meaningful dialogue. Applying these design guidelines, we created VIVID, which allows instructors to collaborate with LLMs to design, evaluate, and modify pedagogical dialogues. In a within-subjects study with instructors (N=12), we show that VIVID helped instructors select and revise dialogues efficiently, thereby supporting the authoring of quality dialogues. Our findings demonstrate the potential of LLMs to assist instructors in creating high-quality educational dialogues across various learning stages.

Paper Structure (57 sections, 18 figures, 7 tables)

This paper contains 57 sections, 18 figures, 7 tables.

Introduction
Related Work
Simulating Vicarious Learning in an Online Learning Environment
Generating Diverse Educational Dialogues for Vicarious Learners at Scale
Design Workshop
Utterance Patterns and Teaching Strategies
Key utterance categories that are commonly observed in 1-to-1 tutoring and classroom
Three teaching strategies that can positively affect vicarious learners
Workshop Overview
Findings from Design Workshop
Challenges in Converting Video Lectures to Dialogue
Design recommendations that should be considered while designing dialogue for vicarious learners
Enhancing the Educational Dialogue Design Process with LLMs
Utilization of LLM-Generated Dialogues
Criteria for Evaluating the Educational Dialogues
...and 42 more sections

Figures (18)

Figure 1: VIVID's key components of the Initial Generation phase: (A1) User uploads a lecture video; (A2) User trims a video section to convert; (B1) User uses the highlighting feature by selecting a part of the video transcript where vicarious learners may face difficulty understanding; (B2) User writes down the learning context and the dialogue scenario that they want to depict in the final dialogue; and of the Comparison and Selection phase: (C1) VIVID shows a rubric table of learners' understanding levels regarding key concepts stated in the transcript; (C2) VIVID presents generated dialogues in the form of dialogue cards comprising core information from each dialogue.
Figure 2: VIVID's key components of the Refinement phase: (D1) User can edit the content of each utterance directly or with basic editing tools; (D2) User can use the laboratory feature by selecting consecutive utterances and clicking the (D4-1) laboratory button; (D3) VIVID suggests four variations of sub-dialogues as a result; User can replace the original utterances with a variation by clicking the (D4-2) apply button.
Figure 3: Overview of the prompting pipeline for the Initial Generation phase. Each step corresponds to one of the following subsections: (1) Create a rubric for highlighted areas, indicating the learner's understanding level for each concept; (2) Determine the direct learner's understanding level using the highlighted parts and the rubric; (3) Create an answer sheet consisting of the learner's expected answers to the tutor's questions and questions showing where the learner struggles; (4) Generate dialogues based on the guidelines.
Figure 4: Example of a dialogue generated regardless of the prerequisite relationships between key concepts. Concept A is a prerequisite for Concept B. During the conversation, the direct learner did not understand Concept A initially, but grasped it through question-and-answer and later answered Concept B correctly.
Figure 5: Initial Generation pipeline. (a) Understanding level: Example of the direct learner's understanding level using the highlighted parts and the rubric; (b) Answer Sheet and Questions: Example of the answer sheet consisting of the learner's expected answers to the tutor's questions and the expected questions of the direct learner; (c) Generated Dialogue: Example of the final dialogue based on our guideline-based prompt. The green box shows how the concept that was set at level 1 is reflected in the final dialogue.
...and 13 more figures
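
The four steps named in the Figure 3 caption can be read as a chain in which each step's output feeds the next. The following is a minimal sketch of that chaining only, not the authors' implementation: the function `initial_generation`, the `PipelineResult` container, the `LLM` callable type, and all prompt wording are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class PipelineResult:
    rubric: str
    understanding_level: str
    answer_sheet: str
    dialogue: str


def initial_generation(
    transcript: str,
    highlights: List[str],
    guidelines: str,
    llm: LLM,
) -> PipelineResult:
    """Chain the four steps from the Figure 3 caption (prompts are illustrative)."""
    highlighted = "\n".join(highlights)

    # Step 1: rubric of understanding levels for concepts in the highlighted areas.
    rubric = llm(
        "Create a rubric of learner understanding levels for each concept in "
        f"these highlighted transcript parts:\n{highlighted}"
    )

    # Step 2: the direct learner's understanding level, from highlights + rubric.
    level = llm(
        "Using this rubric:\n"
        f"{rubric}\n"
        "determine the direct learner's understanding level for:\n"
        f"{highlighted}"
    )

    # Step 3: answer sheet with expected answers and questions showing struggles.
    answer_sheet = llm(
        "Given the learner's understanding level:\n"
        f"{level}\n"
        "and the transcript:\n"
        f"{transcript}\n"
        "write the learner's expected answers to the tutor's questions and "
        "questions showing where the learner struggles."
    )

    # Step 4: the dialogue itself, conditioned on the design guidelines.
    dialogue = llm(
        "Follow these dialogue design guidelines:\n"
        f"{guidelines}\n"
        "Use this answer sheet:\n"
        f"{answer_sheet}\n"
        "Convert this transcript into a tutor-learner dialogue:\n"
        f"{transcript}"
    )

    return PipelineResult(rubric, level, answer_sheet, dialogue)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so the chaining can be exercised with a stub function before a real model is attached.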
