F.A.C.U.L.: Language-Based Interaction with AI Companions in Gaming

Wenya Wei, Sipeng Yang, Qixian Zhou, Ruochen Liu, Xuelei Zhang, Yifu Yuan, Yan Jiang, Yongle Luo, Hailong Wang, Tianzhou Wang, Peipei Jin, Wangtong Liu, Zhou Zhao, Xiaogang Jin, Elvis S. Liu

TL;DR

F.A.C.U.L. introduces a real-time language-based AI companion for FPS games that grounds natural language commands in the game environment via confidence-based instruction reasoning and multimodal scene understanding. The framework combines a fast BERT-FID module with a domain-aligned LLM to balance speed and accuracy, and uses dynamic entity retrieval to map commands to in-game assets. Real-world evaluation in Arena Breakout: Infinite demonstrates strong command understanding (87.2% accuracy) and solid real-time performance (≈613 ms latency, ≈916 QPS), supported by user studies favoring language-enabled collaboration over traditional controls. The work offers a practical, scalable path to immersive AI teammates with potential extensions to audio cues and personalized NPC behaviors in diverse genres.

Abstract

In cooperative video games, traditional AI companions are deployed to assist players, who control them using hotkeys or command wheels to issue predefined commands such as "attack", "defend", or "retreat". Despite their simplicity, these methods, which lack target specificity, limit players' ability to give complex tactical instructions and hinder immersive gameplay experiences. To address this problem, we propose the FPS AI Companion who Understands Language (F.A.C.U.L.), the first real-time AI system that enables players to communicate and collaborate with AI companions using natural language. By integrating natural language processing with a confidence-based framework, F.A.C.U.L. efficiently decomposes complex commands and interprets player intent. It also employs a dynamic entity retrieval method for environmental awareness, aligning human intentions with decision-making. Unlike traditional rule-based systems, our method supports real-time language interactions, enabling players to issue complex commands such as "clear the second floor", "take cover behind that tree", or "retreat to the river". The system provides real-time behavioral responses and vocal feedback, ensuring seamless tactical collaboration. Using the popular FPS game Arena Breakout: Infinite as a case study, we present comparisons demonstrating the efficacy of our approach and discuss the advantages and limitations of AI companions based on real-world user feedback.
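
The confidence-based idea mentioned above (a fast parser handles commands it is sure about, and a slower model handles the rest) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `Parse`, `route_command`, `toy_fast_parser`, and `toy_slow_parser`, the 0.8 threshold, and the keyword lookup are hypothetical stand-ins, not the paper's BERT-FID module, its LLM, or its actual routing rule.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Parse:
    """Hypothetical parse result: an intent label plus a confidence score."""
    intent: str
    entities: List[str] = field(default_factory=list)
    confidence: float = 0.0


Parser = Callable[[str], Parse]

# Toy vocabulary standing in for a trained fast classifier (assumption).
KNOWN_INTENTS = {"attack", "defend", "retreat"}


def toy_fast_parser(text: str) -> Parse:
    """Cheap stand-in parser: confident only when exactly one known verb appears."""
    hits = [w for w in text.lower().split() if w in KNOWN_INTENTS]
    if len(hits) == 1:
        return Parse(intent=hits[0], confidence=0.95)
    return Parse(intent="unknown", confidence=0.30)


def toy_slow_parser(text: str) -> Parse:
    """Stand-in for a slower, more capable model; here it only flags the command."""
    return Parse(intent="needs_planning", confidence=1.0)


def route_command(
    text: str,
    fast: Parser,
    slow: Parser,
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> Tuple[Parse, str]:
    """Return the fast parse if it is confident enough, else fall back to the slow path."""
    parse = fast(text)
    if parse.confidence >= threshold:
        return parse, "fast"
    return slow(text), "slow"


if __name__ == "__main__":
    for command in ("retreat to the river", "clear the second floor"):
        parse, path = route_command(command, toy_fast_parser, toy_slow_parser)
        print(f"{command!r}: path={path}, intent={parse.intent}")
```

In this sketch the threshold is the single knob trading latency against accuracy: raising it sends more commands down the slow path, lowering it trusts the fast parser more often.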

Paper Structure

This paper contains 23 sections, 1 equation, 11 figures, 5 tables, 2 algorithms.

Figures (11)

  • Figure 1: F.A.C.U.L. is the first open-interaction and real-time language-operated AI companion system for commercial video games.
  • Figure 2: Overview of F.A.C.U.L. Players cooperate with the agent in combat by speaking freely into the microphone, and receive behavioral responses with voice feedback.
  • Figure 3: Multi-task label structure designed for command segmentation, intent classification, and named entity recognition.
  • Figure 4: Model Architecture of BERT-FID.
  • Figure 5: LLM for action planning. The LLM planner synthesizes Python code that calls functions to orchestrate the actions of F.A.C.U.L. agents, reasoning and acting to address complex tactical tasks.
  • ...and 6 more figures