A 3D Framework for Improving Low-Latency Multi-Channel Live Streaming
Aizierjiang Aiersilan, Zhiqiang Wang
TL;DR
This paper tackles low-latency, synchronized multi-channel live streaming under variable network conditions. Its Unity 3D-based framework maps multiple camera feeds onto virtual canvases in a shared 3D scene and captures them with an in-world camera to produce a single consolidated stream. The approach emphasizes modularity, low latency, and spatial awareness, enabling real-time user interaction and scalable multi-channel handling while supporting VR/AR/MR contexts. Key findings include a latency reduction of up to 68.7% compared with baselines, consistent latency across channel counts, zero synchronization offset (a consequence of consolidating all channels into one stream), and robust scalability up to 50 devices, with per-channel video quality decreasing as the number of channels grows. In practice, the framework offers a flexible, open-source solution for low-latency multi-channel streaming in virtual events, remote collaboration, and other immersive applications, with broad protocol compatibility and a concrete theoretical model for data transmission.
Abstract
The advent of 5G has driven the demand for high-quality, low-latency live streaming. However, several challenges persist, particularly in real-time video streaming: managing the increased data volume, keeping multiple streams synchronized, and maintaining consistent quality under varying network conditions. To address these issues, we propose a novel framework that leverages 3D virtual environments within game engines (e.g., Unity 3D) to optimize multi-channel live streaming. Our approach consolidates multi-camera video data into a single stream using multiple virtual 3D canvases, substantially increasing the number of channels that can be carried while reducing latency and enhancing user flexibility. To demonstrate the approach, we use the Unity 3D engine to integrate multiple video inputs into a single-channel stream, supporting one-to-many broadcasting, one-to-one video calling, and real-time control of video channels. By mapping video data onto a world-space canvas and capturing it via an in-world camera, we minimize redundant data transmission and achieve efficient, low-latency streaming. Our results show that this method outperforms some existing multi-channel live streaming solutions in both latency and user-interaction responsiveness. The live video streaming system accompanying this paper is open source at https://github.com/Aizierjiang/LiveStreaming.
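The consolidation idea described above can be illustrated with a minimal sketch. This is not the paper's Unity implementation: it is a simplified 2D analogue in plain Python that tiles several equally sized frames into one composite frame, so that a single stream carries every channel. The function name `consolidate`, the near-square grid layout, and the list-of-rows frame representation are all assumptions made for illustration.

```python
from math import ceil, sqrt


def consolidate(frames, fill=0):
    """Tile equally sized frames into one composite frame (hypothetical helper).

    Each frame is a list of pixel rows. The result is a single frame laid out
    as a near-square grid, padded with blank tiles where the grid is not full.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share one resolution")

    # Assumed layout: a grid that is as close to square as possible.
    cols = ceil(sqrt(len(frames)))
    rows = ceil(len(frames) / cols)

    # Pad with blank tiles so every grid cell is occupied.
    blank = [[fill] * width for _ in range(height)]
    padded = frames + [blank] * (rows * cols - len(frames))

    composite = []
    for r in range(rows):
        tiles = padded[r * cols:(r + 1) * cols]
        for y in range(height):
            line = []
            for tile in tiles:
                line.extend(tile[y])  # place tiles side by side
            composite.append(line)
    return composite


# Three 2x2 "camera" frames, each filled with its channel id.
channels = [[[c, c], [c, c]] for c in (1, 2, 3)]
composite = consolidate(channels)
```

Because every channel lands in the same composite frame, all channels share one timestamp per frame, which is the intuition behind the zero synchronization offset reported in the TL;DR. The sketch also makes the quality trade-off visible: at a fixed output resolution, each added channel receives a smaller share of the pixels.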
