Differentially private fine-tuned NF-Net to predict GI cancer type
Sai Venkatesh Chilukoti, Imran Hossen Md, Liqun Shan, Vijay Srinivas Tida, Xiali Hei
TL;DR
This work addresses privacy concerns in classifying gastrointestinal (GI) cancer as MSI vs. MSS from histology images by integrating differential privacy (DP) with a pre-trained NF-Net feature extractor. The main approach fine-tunes NF-Net with DP-AdamW and its adaptive variant, and systematically studies privacy budgets, noise scales, clipping, and data-balancing strategies. Key findings show a privacy-utility trade-off: non-private accuracy reaches $88.98\%$, while DP-AdamW and Adaptive DP-AdamW achieve $74.58\%$ and $76.48\%$, respectively, with Adaptive DP-AdamW generally performing better across settings; the best private performance is $76.48\%$ at $\epsilon=1$ when adding an extra trainable convolutional layer. The work demonstrates a practical DP-enabled pipeline for GI cancer classification from histopathology images and provides guidance on optimizer choice, layer fine-tuning, and privacy accounting to balance utility and privacy in real-world deployment.
Abstract
Based on global genomic status, a cancer tumor is classified as Microsatellite Instable (MSI) or Microsatellite Stable (MSS). Immunotherapy is used to treat MSI tumors, whereas radiation and chemotherapy are used for MSS tumors. It is therefore important to classify a gastrointestinal (GI) cancer tumor as MSI or MSS so that appropriate treatment can be provided. The existing literature shows that deep learning (DL) can predict the class of GI cancer tumors directly from histological images. However, DL models are susceptible to various threats, including membership inference attacks and model extraction attacks. These attacks make DL models impractical to use in real-world scenarios. To keep DL models useful while maintaining privacy, we integrate differential privacy (DP) with DL. In particular, this paper aims to predict the state of GI cancer while preserving the privacy of sensitive data. We fine-tuned the Normalizer Free Net (NF-Net) model and obtained an accuracy of 88.98% without DP when predicting GI cancer status. When we fine-tuned the NF-Net using DP-AdamW and adaptive DP-AdamW, we obtained accuracies of 74.58% and 76.48%, respectively. Moreover, we investigate the Weighted Random Sampler (WRS) and class weighting (CW) to address the data imbalance. We also evaluate and analyze the DP algorithms in different settings.
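To make the DP fine-tuning mentioned above more concrete, the following is a minimal sketch of the gradient privatization step that DP optimizers in the DP-SGD family commonly share: each per-example gradient is clipped to a fixed L2 norm, the clipped gradients are summed, Gaussian noise scaled by the clipping norm is added, and the result is averaged. It is an illustration under stated assumptions, not the authors' DP-AdamW or adaptive DP-AdamW implementation. The function names, the toy gradients, and the parameter names `max_norm` and `noise_multiplier` are hypothetical, and the AdamW update that would consume the privatized gradient is omitted.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Gradients already inside the norm ball are left unchanged (scale = 1).
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Assumption: noise standard deviation = noise_multiplier * max_norm,
    the usual DP-SGD convention. Privacy accounting is not shown here.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, max_norm)
        for i in range(dim):
            total[i] += clipped[i]
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


# Toy per-example gradients (hypothetical values for illustration only).
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]

# With noise_multiplier = 0 only clipping is applied, which makes the
# effect of the clipping norm easy to inspect in isolation.
noiseless = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)

# With a positive noise multiplier the averaged gradient is perturbed.
noisy = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
```

In this sketch a larger `noise_multiplier` corresponds to a stronger privacy guarantee (smaller $\epsilon$) at the cost of a noisier update, which is the privacy-utility trade-off the reported accuracies reflect.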
