Data Alchemy: Mitigating Cross-Site Model Variability Through Test Time Data Calibration

Abhijeet Parida; Antonia Alomar; Zhifan Jiang; Pooneh Roshanitabrizi; Austin Tapp; Maria Ledesma-Carbayo; Ziyue Xu; Syed Muhammed Anwar; Marius George Linguraru; Holger R. Roth

TL;DR

This work tackles cross-site variability in histopathology imaging and regulatory barriers to site-specific model updates. It introduces Data Alchemy, an explainable stain normalization method coupled with test-time data calibration that learns a site-alignment template without altering model weights. Empirical results on CAMELYON16 demonstrate improved stain fidelity (lower $cycleL1$, higher $SSIM$/$PSNR$ and $AP(i,p)$) and substantial cross-site gains in tumor-classification performance (AUPR up to $0.852$, often nearing or surpassing an upper-bound model). By enabling rapid, regulatory-friendly deployment of pre-trained clinical tools across sites, Data Alchemy reduces domain gaps with minimal operational overhead and enhances generalizability in digital pathology.

Abstract

Deploying deep learning-based imaging tools across clinical sites is challenging because of inherent domain shifts and the regulatory hurdles associated with site-specific fine-tuning. In histopathology, stain normalization techniques can mitigate discrepancies, but they often fall short of eliminating inter-site variations. We therefore present Data Alchemy, an explainable stain normalization method combined with test-time data calibration via a template learning framework, to overcome barriers in cross-site analysis. Data Alchemy handles the shifts inherent to multi-site data and minimizes them without changing the weights of the normalization or classifier networks. Our approach extends to unseen sites in various clinical settings where data domain discrepancies are unknown. Extensive experiments highlight the efficacy of our framework for tumor classification in hematoxylin and eosin-stained patches. Our explainable normalization method improves the classification task's area under the precision-recall curve (AUPR) by 0.165, from 0.545 to 0.710. Data Alchemy further reduces the multi-site classification domain gap, adding another 0.142 to reach an AUPR of 0.852, a total gain of 0.307 over the initial 0.545. By allowing pre-trained deep learning-based clinical tools to be integrated seamlessly across multiple sites, our Data Alchemy framework can popularize precision medicine with minimal operational overhead.
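
As a rough illustration of the calibration idea described in the abstract, the toy sketch below learns an input-space template while a classifier stays frozen. It is not the authors' implementation: the logistic classifier, the additive form of the template, the small labeled calibration set, the cross-entropy objective, and all names (`learn_template`, `shift`, and so on) are assumptions made for illustration, and the stain normalization network is omitted entirely.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(p, y, eps=1e-9):
    """Mean binary cross-entropy between probabilities p and labels y."""
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))


def learn_template(x, y, w, b, steps=200, lr=0.5):
    """Learn an additive input template t; the classifier (w, b) is never updated.

    Hypothetical objective: cross-entropy on a labeled calibration set.
    """
    t = np.zeros_like(w)
    for _ in range(steps):
        p = sigmoid((x + t) @ w + b)
        # Gradient of the mean cross-entropy with respect to t only.
        grad_t = np.mean(p - y) * w
        t -= lr * grad_t
    return t


rng = np.random.default_rng(0)

# Frozen "pre-trained" classifier (toy weights, assumed for illustration).
w = np.array([1.5, -2.0, 0.5])
b = 0.1
w_before = w.copy()

# Source-like features and labels, then a simulated site shift on the inputs.
x_src = rng.normal(size=(256, 3))
y = (x_src @ w + b > 0).astype(float)
shift = np.array([2.0, -1.0, 1.0])
x_site = x_src + shift

loss_before = bce(sigmoid(x_site @ w + b), y)
t = learn_template(x_site, y, w, b)
loss_after = bce(sigmoid((x_site + t) @ w + b), y)

print(f"loss before calibration: {loss_before:.3f}")
print(f"loss after calibration:  {loss_after:.3f}")
```

Only the template `t` is optimized here, so the classifier parameters are identical before and after calibration, which mirrors the constraint in the abstract that network weights remain unchanged.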

Paper Structure (18 sections, 7 figures, 3 tables)

Introduction
Methods and Experimental Settings
Explainable stain normalization
Downstream classification task
Data Alchemy: Test time data calibration
Dataset
Evaluation metrics
Results
Comparison with other stain normalization techniques
Stain normalization on downstream task
Test time data calibration
Conclusion
Acknowledgements
Data splits
site A split json
...and 3 more sections

Figures (7)

Figure 1: Stain normalization during training (left) and inference (right).
Figure 2: Data Alchemy uses the normalization network and a classifier to learn a template at test time to improve the classifier performance when deployed at a site.
Figure 3: Qualitative visualization of stain normalization. A detailed version is available in the appendix.
Figure 4: Eigenvalues and eigenvectors of a content, stain, and stain-normalized patch projected onto a 3D sphere. Each point on the surface represents an eigenvector, and its color represents the corresponding eigenvalue: red indicates smaller eigenvalues and blue indicates larger ones.
Figure 5: Eigenvalue blending of the content and stain patches produces different staining. The percentages on the arrows are the blending weights of the content and stain patches.
...and 2 more figures
