Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery
Mengya Xu, Mobarakol Islam, Long Bai, Hongliang Ren
TL;DR
This work tackles catastrophic forgetting in continual semantic segmentation for robotic surgery under privacy constraints by introducing a privacy-preserving synthetic framework called CAT-SD. It blends open-source old-instrument foregrounds with synthetic/augmented backgrounds and uses class-aware temperature normalization (CAT) together with multi-scale shifted-feature distillation (SD) to preserve old knowledge while learning new instruments, aided by synthetic pseudo-exemplars generated with StyleGAN-XL and blending/harmonization. The approach outperforms baseline continual learning methods on EndoVis 2017/2018, with ablations confirming the importance of CAT and SD and robustness analyses demonstrating resilience to input perturbations. This framework reduces data collection and privacy risks while enabling continual updates to instrument repertoires in robot-assisted surgery, with potential for incremental domain adaptation in future work.
Abstract
Semantic segmentation of robotic instruments and tissues based on deep neural networks (DNNs) can enhance the precision of surgical activities in robot-assisted surgery. However, unlike biological learning, DNNs cannot learn incremental tasks over time and exhibit catastrophic forgetting: a sharp decline in performance on previously learned tasks after learning a new one. Specifically, when data are scarce, the model's performance on previously learned instruments drops rapidly after it is trained on new data containing new instruments. The problem worsens when privacy concerns prevent releasing the old-instrument dataset used to train the old model, and when data for new or updated instruments are unavailable to the continual learning model. To address this, we develop a privacy-preserving synthetic continual semantic segmentation framework that blends and harmonizes (i) open-source old-instrument foregrounds with synthesized backgrounds, without publicly revealing real patient data, and (ii) new-instrument foregrounds with extensively augmented real backgrounds. To promote balanced logit distillation from the old model to the continual learning model, we design overlapping class-aware temperature normalization (CAT), which controls model learning utility. We also introduce multi-scale shifted-feature distillation (SD) to maintain long- and short-range spatial relationships among semantic objects, since conventional short-range spatial features carry limited information and weaken feature distillation. We demonstrate the effectiveness of our framework on the EndoVis 2017 and 2018 instrument segmentation datasets under a generalized continual learning setting. Code is available at https://github.com/XuMengyaAmy/Synthetic_CAT_SD.
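For readers unfamiliar with logit distillation, the following is a minimal sketch of the standard temperature-scaled form on which such methods typically build. It is not the CAT formulation, which the abstract does not specify: the single scalar temperature, the function names, and the example logits are all assumptions made for illustration only.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max before exponentiating
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """KL(p_old || p_new) on temperature-softened outputs, scaled by T^2.

    Assumption: one scalar temperature shared by all classes. A class-aware
    scheme would choose temperatures per class; that rule is not given here.
    """
    p_old = softmax_with_temperature(old_logits, temperature)
    p_new = softmax_with_temperature(new_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0)
    return temperature ** 2 * kl


# Hypothetical per-pixel logits over three classes.
old_logits = [4.0, 1.0, 0.5]   # frozen old model
new_logits = [2.5, 2.0, 0.5]   # continual learning model

loss_same = distillation_loss(old_logits, old_logits)
loss_diff = distillation_loss(old_logits, new_logits)
```

In this sketch the loss is zero when the two models agree and grows as the new model's softened predictions drift from the old model's. A higher temperature flattens both distributions, so less-confident classes contribute more to the comparison.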
