Task-oriented and Semantics-aware Communications for Augmented Reality
Zhe Wang, Yansha Deng
TL;DR
This work addresses the high bandwidth and ultra-low latency requirements of AR in the metaverse with TSAR, a task-oriented and semantics-aware communication framework. Instead of transmitting raw point clouds, TSAR transmits semantic information extracted by SANet and combines it with task-oriented base knowledge selection and avatar pose recovery, reducing both data volume and latency. Compared with traditional point-cloud AR communication, the authors report a 95.6% reduction in transmission latency and gains in geometry and color fidelity of up to 82.4% and 20.4%, respectively. They also explore an Euler-angle variant (E-TSAR) and run ablation studies on base knowledge. By emphasizing semantic relevance and task-level optimization, the framework targets scalable, efficient AR communication for metaverse applications.
Abstract
With the emergence of the metaverse and its related applications in Augmented Reality (AR), the current bit-oriented network struggles to support real-time changes for the vast amount of associated information, creating a significant bottleneck in the metaverse's development. To address this problem, we present a novel task-oriented and semantics-aware communication framework for augmented reality (TSAR) that significantly enhances communication efficiency and effectiveness. We first analyze the traditional wireless AR point cloud communication framework, then summarize our proposed semantic information extraction within the end-to-end communication. Next, we detail the components of the TSAR framework: semantics extraction with deep learning, task-oriented base knowledge selection, and avatar pose recovery. Through experiments, we demonstrate that the proposed TSAR framework considerably outperforms the traditional point cloud communication framework, reducing wireless AR application transmission latency by 95.6% and improving communication effectiveness in geometry and color aspects by up to 82.4% and 20.4%, respectively.
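
To give a rough sense of why sending semantic information instead of a raw point cloud shrinks the payload, the following is a minimal back-of-the-envelope sketch. It is not the authors' implementation: the byte layout, the point count, the joint count, and the idea of representing the semantic information as per-joint 3D coordinates are all assumptions made here for illustration, and the resulting ratio is not a result from the paper.

```python
# Illustrative payload comparison. Every size below is a hypothetical
# assumption for this sketch, not a value reported in the paper.

def point_cloud_bytes(num_points: int) -> int:
    """Raw point cloud payload.

    Assumed layout: xyz as three float32 values (12 bytes)
    plus rgb as three uint8 values (3 bytes) per point.
    """
    return num_points * (3 * 4 + 3 * 1)


def keypoint_bytes(num_joints: int) -> int:
    """Semantic payload.

    Assumed layout: one xyz float32 triple (12 bytes) per skeleton joint.
    """
    return num_joints * 3 * 4


def reduction(raw: int, semantic: int) -> float:
    """Fraction of the raw payload that is saved by sending semantics."""
    return 1.0 - semantic / raw


raw = point_cloud_bytes(100_000)  # hypothetical avatar with 100k points
sem = keypoint_bytes(25)          # hypothetical skeleton with 25 joints
saved = reduction(raw, sem)
print(f"raw={raw} B, semantic={sem} B, saved={saved:.4%}")
```

Under these assumed sizes the semantic payload is several orders of magnitude smaller than the raw one. The receiver then has to reconstruct the avatar from shared base knowledge and the received pose, which is the role the abstract assigns to avatar pose recovery.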
