VVRec: Reconstruction Attacks on DL-based Volumetric Video Upstreaming via Latent Diffusion Model with Gamma Distribution
Rui Lu, Bihai Zhang, Dan Wang
TL;DR
This work tackles privacy risks in DL-based volumetric video upstreaming by demonstrating a reconstruction attack that recovers original point clouds from intercepted intermediate results. It introduces VVRec, a four-module attacker that leverages latent diffusion models with a Gamma distribution (SLDM, LPGDM, LPG, RD), plus a Point Cloud Refinement (PCR) step, to produce high-quality reconstructions. Across three volumetric video datasets and multiple victim models, VVRec achieves high reconstruction quality, including color recovery, and outperforms prior baselines, while showing that simple protective perturbations offer only limited protection. The results highlight a concrete privacy threat in volumetric video streaming and motivate the development of robust defense mechanisms.
Abstract
With the popularity of 3D volumetric video applications such as autonomous driving, virtual reality, and mixed reality, developers have turned to deep learning to compress volumetric video frames, i.e., point clouds, for video upstreaming. The latest deep learning-based solutions offer higher efficiency, lower distortion, and better hardware support than traditional ones such as MPEG and JPEG. However, they also introduce privacy threats, especially reconstruction attacks that aim to recover the original input point cloud from the intermediate results. In this paper, we design VVRec, which is, to the best of our knowledge, the first reconstruction attack scheme targeting DL-based volumetric video. VVRec reconstructs high-quality point clouds from intercepted intermediate transmission results using four well-trained neural network modules that we design. Leveraging the latest latent diffusion models with Gamma distribution and a refinement algorithm, VVRec excels in reconstruction quality and color recovery, and remains effective against existing defenses. We evaluate VVRec on three volumetric video datasets. The results demonstrate that VVRec achieves 64.70 dB reconstruction accuracy, with a 46.39% reduction in distortion over baselines.
