Embedding Large Language Models into Extended Reality: Opportunities and Challenges for Inclusion, Engagement, and Privacy

Efe Bozkir, Süleyman Özdel, Ka Hei Carrie Lau, Mengdi Wang, Hong Gao, Enkelejda Kasneci

TL;DR

This paper addresses the limitation of scripted NPCs in XR and explores embedding large language models (LLMs) into XR spaces as avatars or narratives to enhance inclusion, diversity, and user engagement. It outlines a practical pipeline combining speech-to-text, text-to-speech, and multimodal LLMs with prompt engineering and fine-tuning to support open-ended dialogue in XR. The authors discuss opportunities across inclusion, engagement, and privacy, while outlining challenges such as hallucinations, latency, data storage, and potential privacy invasions from integrating LLM interactions with biometric XR data, advocating privacy-aware design and user-centered controls. The work contributes a research agenda for evaluating privacy attitudes, developing adaptive and inclusive XR interactions, and guiding future empirical studies.

Abstract

Advances in artificial intelligence and human-computer interaction will likely lead to extended reality (XR) becoming pervasive. While XR can provide users with interactive, engaging, and immersive experiences, non-player characters are often utilized in pre-scripted and conventional ways. This paper argues for using large language models (LLMs) in XR by embedding them in avatars or as narratives to facilitate inclusion through prompt engineering and fine-tuning the LLMs. We argue that this inclusion will promote diversity for XR use. Furthermore, the versatile conversational capabilities of LLMs will likely increase engagement in XR, helping XR become ubiquitous. Lastly, we speculate that combining the information provided to LLM-powered spaces by users and the biometric data obtained might lead to novel privacy invasions. While exploring potential privacy breaches, examining user privacy concerns and preferences is also essential. Therefore, despite challenges, LLM-powered XR is a promising area with several opportunities.

Paper Structure (6 sections, 1 figure)

This paper contains 6 sections and 1 figure.

Figures (1)

  • Figure 1: An example of a possible data processing pipeline.