Secure AI-Driven Super-Resolution for Real-Time Mixed Reality Applications

Mohammad Waquas Usmani, Sankalpa Timilsina, Michael Zink, Susmit Shannigrahi

TL;DR

The paper tackles the high bandwidth and motion-to-photon latency challenge in immersive MR streaming by integrating point-cloud downsampling at the origin, CP-ABE-based partial encryption, and client-side ML-driven super-resolution. It evaluates a secure SR distribution pipeline using LivingRoom and Office datasets, demonstrating near-linear reductions in bandwidth, latency, and crypto overhead as downsampling increases, while the Random Forest SR model achieves sub-millimeter geometric accuracy with modest inference times. The work uniquely combines cryptographic access control with ML-based upsampling, enabling scalable secure delivery of volumetric content and providing a practical framework for future end-to-end streaming with ABR and edge-accelerated inference. Overall, the approach offers a viable path to reduce data transfer and latency in MR applications without compromising access control, with clear avenues for real-time deployment and optimization.

Abstract

Immersive formats such as 360° and 6DoF point cloud videos require high bandwidth and low latency, posing challenges for real-time AR/VR streaming. This work focuses on reducing bandwidth consumption and encryption/decryption delay, two key contributors to overall latency. We design a system that downsamples point cloud content at the origin server and applies partial encryption. At the client, the content is decrypted and upscaled using an ML-based super-resolution model. Our evaluation demonstrates a nearly linear reduction in bandwidth, latency, and encryption/decryption overhead at lower downsampling resolutions, while the super-resolution model effectively reconstructs the original full-resolution point clouds with minimal error and modest inference time.
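To make the bandwidth side of this trade-off concrete, here is a minimal sketch of origin-side downsampling. It assumes uniform random point selection and a synthetic cloud of 196k points stored as float32 XYZ. The abstract does not state the paper's sampling strategy, point format, or codec, so `random_downsample`, `keep_fraction`, and the synthetic data are hypothetical names and choices made only for illustration.

```python
import numpy as np


def random_downsample(points: np.ndarray, keep_fraction: float, seed: int = 0) -> np.ndarray:
    """Keep a uniformly random subset of points.

    Assumption: uniform random sampling stands in for whatever
    downsampling strategy the paper actually uses.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(len(points) * keep_fraction)))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    # Sort the indices so the surviving points keep their original order.
    return points[np.sort(idx)]


# Synthetic stand-in for a 196k-point frame (XYZ only, float32).
cloud = np.random.default_rng(42).random((196_000, 3)).astype(np.float32)

# Raw payload size in bytes at each resolution level.
sizes = {f: random_downsample(cloud, f).nbytes for f in (1.0, 0.5, 0.25, 0.125)}

for fraction, nbytes in sizes.items():
    print(f"{fraction:>6.1%} of points -> {nbytes / 1e6:.3f} MB raw")
```

In this sketch the raw payload scales in direct proportion to the fraction of points kept, and anything whose cost grows with payload size (transfer, encryption, decryption) would be expected to shrink along with it. The figures the paper reports also depend on its encoding and encryption scheme, which this sketch does not model.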

Paper Structure

This paper contains 23 sections, 1 equation, 7 figures, 2 tables.

Figures (7)

  • Figure 1: System architecture. Point clouds are downsampled from full (100%) resolution to 50%, 25%, and 12.5%, encrypted using ABE, stored at the origin, and delivered via CDN. Clients decrypt and upsample the frames using an AI/ML model.
  • Figure 2: Visuals of the original point clouds: 196k, 253k, 282k, 310k, 333k, 380k, 430k, 805k.
  • Figure 3: Encryption and decryption times, and encrypted data sizes across various resolutions for different point cloud sizes.
  • Figure 4: Chamfer and Hausdorff distances for Model A (cross-domain) under different input densities upsampled to 100%.
  • Figure 5: Chamfer and Hausdorff distances for Model B (mixed-domain) under different input densities upsampled to 100%.
  • ...and 2 more figures
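
Figures 4 and 5 report Chamfer and Hausdorff distances between upsampled and original point clouds. The following is a minimal brute-force sketch of both metrics. It assumes the symmetric Chamfer distance is the sum of the two mean nearest-neighbour Euclidean distances, and that the Hausdorff distance is the larger of the two directed maxima. Conventions vary (squared vs. unsquared, sum vs. average), and the listing above does not say which variant the paper uses. The function names are illustrative.

```python
import numpy as np


def pairwise_distances(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Euclidean distance between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance (assumed variant: sum of mean NN distances)."""
    d = pairwise_distances(a, b)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


def hausdorff_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance: worst-case nearest-neighbour distance."""
    d = pairwise_distances(a, b)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Tiny worked example: b contains a plus one outlier at x = 3.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0]])

print(chamfer_distance(a, b))    # 0 + (0 + 0 + 2) / 3
print(hausdorff_distance(a, b))  # 2.0, driven entirely by the outlier
```

The example shows why the two metrics are reported together: Chamfer distance averages over all points and reflects overall fidelity, while Hausdorff distance is set by the single worst point. The brute-force distance matrix needs O(N·M) memory, so clouds with hundreds of thousands of points would need a spatial index such as a k-d tree instead.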