Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery
Jialang Xu, Jiacheng Wang, Lequan Yu, Danail Stoyanov, Yueming Jin, Evangelos B. Mazomenos
TL;DR
This work introduces PFedSIS, a personalized federated learning framework for surgical instrument segmentation that encodes visual trait priors through three components: Global-Personalized Disentanglement (GPD) for head-wise self-attention personalization, Appearance-regulation Personalized Enhancement (APE) to align local appearance via hypernetwork-guided updates, and Shape-similarity Global Enhancement (SGE) to preserve cross-site shape information. By decoupling global and personalized parameters and incorporating style-memory-based cross-style augmentation, PFedSIS achieves statistically significant improvements in Dice and IoU while reducing segmentation boundary errors across three diverse surgical datasets. The approach maintains real-time inference and demonstrates robustness to appearance heterogeneity and inter-site instrument-shape similarity, offering a privacy-preserving path toward site-tailored SIS models with practical clinical impact.
Abstract
Personalized federated learning (PFL) for surgical instrument segmentation (SIS) is a promising approach. It enables multiple clinical sites to collaboratively train a set of models while preserving privacy, with each model tailored to the data distribution of its own site. Existing PFL methods rarely consider the personalization of multi-headed self-attention, and do not account for appearance diversity and instrument shape similarity, both inherent in surgical scenes. We thus propose PFedSIS, a novel PFL method with visual trait priors for SIS, incorporating global-personalized disentanglement (GPD), appearance-regulation personalized enhancement (APE), and shape-similarity global enhancement (SGE), to boost SIS performance in each site. GPD represents the first attempt at head-wise assignment for multi-headed self-attention personalization. To preserve the unique appearance representation of each site and gradually leverage the inter-site difference, APE introduces appearance regulation and provides customized layer-wise aggregation solutions via hypernetworks for each site's personalized parameters. The mutual shape information of instruments is maintained and shared via SGE, which enhances the cross-style shape consistency on the image level and computes the shape-similarity contribution of each site on the prediction level for updating the global parameters. PFedSIS outperforms state-of-the-art methods, with gains of +1.51% Dice, +2.11% IoU, -2.79 ASSD, and -15.55 HD95. The corresponding code and models will be released at https://github.com/wzjialang/PFedSIS.
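The head-wise split between global and personalized parameters described for GPD can be illustrated with a minimal sketch. It is not the released PFedSIS implementation: the function name `aggregate_headwise`, the boolean `personalized` mask, and the plain weighted average are assumptions made for illustration. The abstract does not say which heads are personalized, and in PFedSIS the global update is driven by shape-similarity contributions rather than the fixed weights used here.

```python
from typing import Dict, List

Head = List[float]        # flattened parameters of one attention head
SiteHeads = List[Head]    # one entry per attention head


def aggregate_headwise(
    site_heads: Dict[str, SiteHeads],
    personalized: List[bool],
    weights: Dict[str, float],
) -> Dict[str, SiteHeads]:
    """Average global heads across sites; keep personalized heads local.

    `personalized[h]` is a hypothetical mask: True means head h never
    leaves its site, False means head h is shared and averaged.
    """
    total = sum(weights.values())
    sites = list(site_heads)
    num_heads = len(personalized)

    # Weighted average of every head marked as global.
    global_avg: Dict[int, Head] = {}
    for h in range(num_heads):
        if personalized[h]:
            continue
        dim = len(site_heads[sites[0]][h])
        global_avg[h] = [
            sum(weights[s] * site_heads[s][h][i] for s in sites) / total
            for i in range(dim)
        ]

    # Each site receives the shared global heads and keeps its own
    # personalized heads unchanged.
    return {
        s: [
            list(site_heads[s][h]) if personalized[h] else list(global_avg[h])
            for h in range(num_heads)
        ]
        for s in sites
    }


# Toy example: two sites, two heads; head 0 is global, head 1 is personalized.
sites = {
    "site_a": [[1.0, 2.0], [10.0, 10.0]],
    "site_b": [[3.0, 4.0], [20.0, 20.0]],
}
updated = aggregate_headwise(
    sites,
    personalized=[False, True],
    weights={"site_a": 1.0, "site_b": 1.0},  # assumed equal contributions
)
print(updated["site_a"])  # [[2.0, 3.0], [10.0, 10.0]]
print(updated["site_b"])  # [[2.0, 3.0], [20.0, 20.0]]
```

After aggregation both sites share the same global head, while each keeps the personalized head it trained locally. Replacing the equal `weights` with per-site similarity scores would be one way to approximate a contribution-weighted global update, again as an assumption rather than the authors' exact rule.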
