Heads Up eXperience (HUX): Always-On AI Companion for Human Computer Environment Interaction
Sukanth K, Sudhiksha Kandavel Rajan, Rajashekhar V S, Gowdham Prabhakar
TL;DR
This work introduces Heads Up eXperience (HUX), an always-on AI companion for human–computer environment interaction on XR smart glasses. HUX combines eye gaze, real-time scene analysis, and speech with a multi-modal memory system to support task-specific decisions. The architecture integrates a vision–language model, a large language model, and modular task-specific detectors, enabling real-time EOI/OOI detection, task-focused scene enhancement, and memory retrieval via Retrieval-Augmented Generation. Key contributions include a modular HUX AI architecture, an event-driven video filtering pipeline, task-specific scene processing with modular model switching, and a memory framework that supports context-rich, multi-modal recall. The approach aims to deliver natural, context-aware assistance directly in XR glasses for both personal and professional use, as a step toward deeper human–AI collaboration in daily life.
Abstract
While current personal smart devices excel in digital domains, they fall short in assisting users during human-environment interaction. This paper proposes Heads Up eXperience (HUX), an AI system designed to bridge this gap by serving as a constant companion across extended reality (XR) environments. By tracking the user's eye gaze, analyzing the surrounding environment, and interpreting verbal context, the system captures and enhances multi-modal data, providing holistic context interpretation and memory storage in real-time, task-specific situations. This comprehensive approach enables more natural, empathetic, and intelligent interactions between the user and HUX AI, paving the way for human–computer environment interaction. Intended for deployment in smart glasses and extended reality headsets, HUX AI aims to become a personal and useful AI companion for daily life. By integrating digital assistance with enhanced physical-world interactions, this technology has the potential to reshape human-AI collaboration in both personal and professional spheres and to inform the future of personal smart devices.
