X-Mark: Saliency-Guided Robust Dataset Ownership Verification for Medical Imaging

Pranav Kulkarni, Junfeng Guo, Heng Huang

TL;DR

This work tackles the challenge of protecting medical imaging datasets from unauthorized use by introducing X-Mark, a sample-specific clean-label backdoor watermarking method for chest X-rays. It uses a conditional U-Net with EigenCAM-based saliency conditioning and Laplacian regularization to generate perturbations that survive downsampling while remaining diagnostically acceptable. The method demonstrates 100% watermark success and strong black-box verification performance on CheXpert, with robust transferability across resolutions and model architectures, and resilience to adaptive attacks such as fine-tuning and pruning. This approach offers a practical, imperceptible, and scalable mechanism for dataset ownership verification in clinical imaging, with potential extensions to other modalities and tasks.

Abstract

High-quality medical imaging datasets are essential for training deep learning models, but their unauthorized use raises serious copyright and ethical concerns. Medical imaging presents a unique challenge for existing dataset ownership verification methods designed for natural images: static watermark patterns generated in fixed-scale images scale poorly to dynamic, high-resolution scans with limited visual diversity and subtle anatomical structures, and any watermark must also preserve diagnostic quality. In this paper, we propose X-Mark, a sample-specific clean-label watermarking method for chest X-ray copyright protection. Specifically, X-Mark uses a conditional U-Net to generate unique perturbations within salient regions of each sample. We design a multi-component training objective to ensure watermark efficacy and robustness against dynamic scaling processes while preserving diagnostic quality and visual-distinguishability. We incorporate Laplacian regularization into our training objective to penalize high-frequency perturbations and achieve watermark scale-invariance. Ownership verification is performed in a black-box setting to detect characteristic behaviors in suspicious models. Extensive experiments on CheXpert verify the effectiveness of X-Mark, achieving a watermark success rate (WSR) of 100% and reducing the probability of false positives in the Ind-M scenario by 12%, while demonstrating resistance to potential adaptive attacks.
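To make the idea of penalizing high-frequency perturbations concrete, the following is a minimal sketch of a Laplacian-style penalty. It is an illustration under stated assumptions, not the paper's loss: the 5-point stencil, the mean-squared reduction, and the names `laplacian_penalty`, `smooth`, and `checker` are all hypothetical, and the abstract does not specify the exact formulation. The sketch only shows why such a term favors smooth perturbations, which tend to survive downsampling better than pixel-level noise.

```python
def laplacian_penalty(delta):
    """Mean squared response of a 5-point discrete Laplacian.

    `delta` is a 2D perturbation given as a list of rows (at least 3x3).
    Only interior pixels are evaluated, so no padding is needed.
    This stencil and reduction are assumptions made for illustration.
    """
    height, width = len(delta), len(delta[0])
    total, count = 0.0, 0
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            lap = (delta[i - 1][j] + delta[i + 1][j]
                   + delta[i][j - 1] + delta[i][j + 1]
                   - 4 * delta[i][j])
            total += lap * lap
            count += 1
    return total / count


# A linear ramp is low-frequency: its discrete Laplacian vanishes.
smooth = [[0.01 * (i + j) for j in range(8)] for i in range(8)]

# A checkerboard alternates sign at every pixel: maximally high-frequency.
checker = [[0.05 if (i + j) % 2 == 0 else -0.05 for j in range(8)]
           for i in range(8)]

print(laplacian_penalty(smooth) < laplacian_penalty(checker))
```

In a training objective, a term of this kind would be added to the other loss components with a weighting coefficient, so that the generator is pushed away from checkerboard-like patterns and toward smoother ones.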

Paper Structure (35 sections, 9 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: The main pipeline of X-Mark. First, a conditional U-Net is trained to generate sample-specific watermarks within salient regions of the medical image. Second, the watermarked dataset is created by embedding watermarks within a subset of target class samples and combining them with the remaining dataset. Finally, black-box dataset ownership verification is performed using hypothesis testing to detect whether watermarked, non-target class samples were misclassified by the suspicious model.
  • Figure 2: Example watermarked samples from SSCL-BW and X-Mark. The red box indicates a region of strong perturbations, which result in anatomically improbable structures that are easy to detect upon manual inspection. Saliency conditioning confines perturbations to salient regions (the chest), while Laplacian regularization mitigates strong, unrealistic perturbations.
  • Figure 3: Watermarked samples and their EigenCAM-based saliency maps from backdoored models using BadNets, SSCL-BW, and X-Mark. For BadNets, EigenCAM mainly focuses on the trigger; for SSCL-BW, it focuses on the regions with the largest perturbations. In contrast, for our method, EigenCAM focuses on the salient regions (i.e., the chest), making the backdoor difficult to detect even with automated methods.
  • Figure 4: Transferability of X-Mark, demonstrating watermark scale-invariance and model-agnostic transferability.
  • Figure 5: An example illustrating the impact of EigenCAM-based saliency conditioning and Laplacian regularization.
  • ...and 2 more figures
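
The verification step described in the Figure 1 caption can be sketched as follows. This is a hedged illustration only: the caption says hypothesis testing is used but does not name the test, so a one-sided exact binomial test is swapped in here as an assumption. The function names, the `baseline_rate` of 0.1, and the significance level `alpha` of 0.01 are hypothetical choices. The inputs are the labels a suspicious model returns, in a black-box manner, for watermarked samples drawn from non-target classes.

```python
from math import comb


def binomial_p_value(k, n, p0):
    """Return P(X >= k) for X ~ Binomial(n, p0), a one-sided exact test."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))


def verify_ownership(predictions, target_class, baseline_rate=0.1, alpha=0.01):
    """Flag a model if watermarked non-target samples flip to the target
    class more often than the assumed baseline misclassification rate.

    `predictions` holds the suspicious model's predicted labels for
    watermarked samples whose true class is NOT `target_class`.
    Returns (flagged, p_value).
    """
    n = len(predictions)
    k = sum(1 for label in predictions if label == target_class)
    p_value = binomial_p_value(k, n, baseline_rate)
    return p_value < alpha, p_value


# Hypothetical model trained on the watermarked data: 18 of 20 flips.
flagged_suspicious, _ = verify_ownership([1] * 18 + [0] * 2, target_class=1)

# Hypothetical independent model: 2 of 20 flips, near the baseline.
flagged_independent, _ = verify_ownership([1] * 2 + [0] * 18, target_class=1)

print(flagged_suspicious, flagged_independent)
```

The null hypothesis in this sketch is that the model misclassifies watermarked samples into the target class no more often than the baseline rate; rejecting it is treated as evidence that the model was trained on the protected dataset.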