Point Transformer with Federated Learning for Predicting Breast Cancer HER2 Status from Hematoxylin and Eosin-Stained Whole Slide Images
Bao Li, Zhenyu Liu, Lizhi Shao, Bensheng Qiu, Hong Bu, Jie Tian
TL;DR
This work tackles predicting HER2 status from HE-stained WSIs by leveraging a point Transformer within a federated learning framework to handle multi-site, non-i.i.d. data with label imbalance. It introduces two novel components: dynamic distribution adjustment (DDA), which stabilizes training under site-specific label imbalance, and farthest cosine sampling (FCS), which captures long-range dependencies in the WSI patch feature space; an auxiliary classifier is added to preserve feature quality. Across four participating sites and unseen external sites, PointTransformerDDA+ achieves state-of-the-art AUC, closely approaching centralized training performance and demonstrating robustness to data scarcity and to variation in IHC2+ cases. The approach highlights the effectiveness of permutation-invariant point-based representations for WSI analysis and offers a privacy-preserving pathway for large-scale biomarker prediction in pathology.
Abstract
Directly predicting human epidermal growth factor receptor 2 (HER2) status from widely available hematoxylin and eosin (HE)-stained whole slide images (WSIs) can reduce technical costs and expedite treatment selection. Accurately predicting HER2 status requires large collections of multi-site WSIs. Federated learning enables collaborative training on these WSIs without transferring gigabyte-sized images between sites and without raising data privacy concerns. However, federated learning struggles with the label imbalance found in real-world multi-site WSIs. Moreover, existing WSI classification methods cannot simultaneously exploit local context information and long-range dependencies in the site-end feature representation of federated learning. To address these issues, we present a point transformer with federated learning for multi-site HER2 status prediction from HE-stained WSIs. Our approach incorporates two novel designs. First, we propose a dynamic label distribution strategy and an auxiliary classifier, which help establish a well-initialized model and mitigate label distribution variations across sites. Second, we propose farthest cosine sampling, a sampling scheme based on cosine distance that selects the most distinctive features and captures long-range dependencies. Extensive experiments and analysis show that our method achieves state-of-the-art performance at four sites with a total of 2687 WSIs. Furthermore, we demonstrate that our model can generalize to two unseen sites with 229 WSIs.
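The abstract describes farthest cosine sampling only at a high level, so the following is a minimal sketch of one plausible reading: greedy farthest point sampling in which Euclidean distance is replaced by cosine distance (1 minus cosine similarity) between patch feature vectors. The function name `farthest_cosine_sampling`, its parameters, and the choice of starting index are assumptions made for illustration and are not taken from the paper; the authors' implementation may differ.

```python
import numpy as np


def farthest_cosine_sampling(features, num_samples, start_index=0):
    """Greedy farthest point sampling under cosine distance.

    Hypothetical helper (not the authors' code). `features` is an
    (N, D) array of patch features; returns `num_samples` row indices.
    """
    features = np.asarray(features, dtype=np.float64)
    n = features.shape[0]
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be in [1, N]")

    # Normalize rows so a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)

    selected = [start_index]
    # Cosine distance from every feature to the nearest selected feature.
    min_dist = 1.0 - unit @ unit[start_index]
    min_dist[start_index] = -np.inf  # never pick the same index twice

    for _ in range(num_samples - 1):
        # Pick the feature farthest from everything selected so far.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, 1.0 - unit @ unit[nxt])
        min_dist[nxt] = -np.inf

    return np.array(selected)


if __name__ == "__main__":
    # Toy features: two nearly parallel vectors, one orthogonal, one opposite.
    toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
    print(farthest_cosine_sampling(toy, num_samples=3))
```

In the toy example, the vector nearly parallel to the starting point is selected last, which illustrates how this kind of sampling favors features that point in distinct directions over near-duplicates.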
