Table of Contents
Fetching ...

ELLMA-T: an Embodied LLM-agent for Supporting English Language Learning in Social VR

Mengxu Pan, Alexandra Kitson, Hongyu Wan, Mirjana Prpa

TL;DR

ELLMA-T is developed, a design probe that integrates an LLM (GPT-4) with an ECA for English language learning in social VR (VRChat), informed by the situated learning framework, and reveals the potential of ELLMA-T to generate realistic, believable, and context-specific role plays for agent-learner interaction in VR.

Abstract

Many people struggle with learning a new language, with traditional tools falling short in providing contextualized learning tailored to each learner's needs. The recent development of large language models (LLMs) and embodied conversational agents (ECAs) in social virtual reality (VR) provide new opportunities to practice language learning in a contextualized and naturalistic way that takes into account the learner's language level and needs. To explore this opportunity, we developed ELLMA-T, an ECA that leverages an LLM (GPT-4) and situated learning framework for supporting learning English language in social VR (VRChat). Drawing on qualitative interviews (N=12), we reveal the potential of ELLMA-T to generate realistic, believable and context-specific role plays for agent-learner interaction in VR, and LLM's capability to provide initial language assessment and continuous feedback to learners. We provide five design implications for the future development of LLM-based language agents in social VR.

ELLMA-T: an Embodied LLM-agent for Supporting English Language Learning in Social VR

TL;DR

ELLMA-T is developed, a design probe that integrates an LLM (GPT-4) with an ECA for English language learning in social VR (VRChat), informed by the situated learning framework, and reveals the potential of ELLMA-T to generate realistic, believable, and context-specific role plays for agent-learner interaction in VR.

Abstract

Many people struggle with learning a new language, with traditional tools falling short in providing contextualized learning tailored to each learner's needs. The recent development of large language models (LLMs) and embodied conversational agents (ECAs) in social virtual reality (VR) provide new opportunities to practice language learning in a contextualized and naturalistic way that takes into account the learner's language level and needs. To explore this opportunity, we developed ELLMA-T, an ECA that leverages an LLM (GPT-4) and situated learning framework for supporting learning English language in social VR (VRChat). Drawing on qualitative interviews (N=12), we reveal the potential of ELLMA-T to generate realistic, believable and context-specific role plays for agent-learner interaction in VR, and LLM's capability to provide initial language assessment and continuous feedback to learners. We provide five design implications for the future development of LLM-based language agents in social VR.
Paper Structure (61 sections, 4 figures, 3 tables)

This paper contains 61 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Workflow of conversation tasks performed by ELLMA-T, including greeting the user, conducting language assessments, engaging in role-play scenarios, and providing feedback.
  • Figure 2: ELLMA-T in different virtual worlds within VRChat: an indoor café (left) and an outdoor city (right).
  • Figure 3: System Architecture of the ELLMA-T. The architecture highlights the core components and data flow within the system.
  • Figure 4: Structure of Separate Prompts for Different Tasks. This diagram illustrates how prompts are structured and separated for various tasks within the system. 1) The system prompt establishes the agent's persona across all interactions. 2) Task-specific prompts guide the agent during the introduction, language assessment, role-play, and feedback. 3) A decision prompt helps the agent determine when to transition between tasks. 4) The prompt for providing scaffolding during role-play conversations.