3D Holistic OR Anonymization
Tony Danjun Wang
TL;DR
This work tackles privacy-preserving analysis of multi-view RGB-D operating-room (OR) videos by introducing a 3D-centric anonymization pipeline that first localizes faces in 3D and then reprojects texture-mapped replacement faces into all views, preserving the data distribution for downstream tasks. It contributes a new multi-view OR RGB-D dataset captured during a real laparoscopic procedure performed on a swine, along with a complete pipeline that combines 3D key-points, SMPL-based mesh fitting, texture rendering via an adversarial autoencoder, and occlusion-aware back-projection. In the reported evaluation, the approach localizes faces more accurately in challenging OR views and produces more realistic anonymized faces than state-of-the-art 2D-detection-based methods and GAN-based baselines, while preserving task-relevant information better than naive obfuscation. The work highlights the practical value of 3D information for privacy in surgical data, outlines limitations (e.g., dependence on 3D key-points and a two-step mesh fitting procedure), and points to open-source tooling for reproducibility and future improvements.
Abstract
We propose a novel method that leverages 3D information to automatically anonymize multi-view RGB-D video recordings of operating rooms (ORs). Our method preserves the original data distribution by replacing the faces in each image with different faces, so that the data remains suitable for downstream tasks. In contrast to established anonymization methods, our approach first localizes faces in 3D space rather than in 2D image space. Each face is then anonymized by reprojecting a different face into each camera view, where it replaces the original face. Furthermore, we introduce a multi-view RGB-D dataset, captured during a real operation in which experienced surgeons perform laparoscopic surgery on an animal subject (swine), which captures typical characteristics of ORs. Finally, we present experimental results on this dataset showing that leveraging 3D data achieves better face localization in OR images and generates more realistic faces than the current state of the art. To our knowledge, no prior work addresses the anonymization of multi-view OR recordings or uses 3D information for 2D face localization.
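The core geometric idea, localizing a face once in 3D and then finding it in every camera view, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard pinhole camera model with known intrinsics and world-to-camera extrinsics, and the names Camera, project, and is_visible, as well as the depth tolerance, are hypothetical and chosen only for illustration. The visibility check compares the projected depth of the 3D point with the sensor's depth map, which is one simple way an occlusion test could be done with RGB-D data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Camera:
    """Hypothetical pinhole camera: intrinsics plus a world-to-camera transform."""
    fx: float
    fy: float
    cx: float
    cy: float
    R: List[List[float]]  # 3x3 rotation, world -> camera
    t: Vec3               # translation, world -> camera
    width: int
    height: int


def world_to_camera(cam: Camera, p: Vec3) -> Vec3:
    """Apply the rigid transform X_cam = R * X_world + t."""
    x, y, z = (
        sum(cam.R[i][j] * p[j] for j in range(3)) + cam.t[i] for i in range(3)
    )
    return (x, y, z)


def project(cam: Camera, p: Vec3) -> Optional[Tuple[float, float, float]]:
    """Project a 3D world point to (u, v, depth), or None if it is not in view."""
    x, y, z = world_to_camera(cam, p)
    if z <= 0:  # point lies behind the camera
        return None
    u = cam.fx * x / z + cam.cx
    v = cam.fy * y / z + cam.cy
    if not (0 <= u < cam.width and 0 <= v < cam.height):
        return None  # projects outside the image
    return (u, v, z)


def is_visible(cam: Camera, p: Vec3, depth_map: List[List[float]],
               tol: float = 0.05) -> bool:
    """Depth test: the point is visible if nothing measured lies clearly in front of it.

    Assumption: depth_map[row][col] is in the same unit as the 3D points,
    and a value of 0 marks a missing measurement (treated as not occluded).
    """
    hit = project(cam, p)
    if hit is None:
        return False
    u, v, z = hit
    measured = depth_map[int(v)][int(u)]
    if measured <= 0:
        return True
    return z <= measured + tol


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    cams = [
        Camera(500.0, 500.0, 320.0, 240.0, identity, (0.0, 0.0, 0.0), 640, 480),
        Camera(500.0, 500.0, 320.0, 240.0, identity, (0.5, 0.0, 0.0), 640, 480),
    ]
    face_center = (0.0, 0.0, 2.0)  # one 3D location, shared by all views
    for i, cam in enumerate(cams):
        print(i, project(cam, face_center))
```

Because the face is localized once in 3D, every view in which the point passes the depth test receives a consistent 2D location, even when a 2D detector would struggle with that view on its own.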
