Learning from Partial Label Proportions for Whole Slide Image Segmentation
Shinnosuke Matsuo, Daiki Suehiro, Seiichi Uchida, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise
TL;DR
The paper tackles segmenting tumor subtypes in whole slide images (WSIs) when only partial label proportions ${\boldsymbol p}^i \in \Delta^C$ are available, i.e., proportions among the $C$ tumor subtypes of a $(C+1)$-class problem whose non-tumor class has no recorded proportion. It decomposes learning from partial label proportions (LPLP) into two weakly supervised subproblems, multiple instance learning (MIL) and learning from label proportions (LLP), and introduces a unified end-to-end architecture that uses differentiable positive-instance masks to share information between them, trained with the loss ${\mathcal L} = {\mathcal L}_\mathrm{LLP}({\boldsymbol p}^i, {\hat{\boldsymbol p}}^i) + w_{\mathrm{MIL}} {\mathcal L}_\mathrm{MIL}(Y^i, {\hat{S}}^i)$. Evaluations on CRC100K and a private chemotherapy dataset show that the method achieves competitive accuracy and mean IoU, closely matching fully supervised baselines and outperforming other LPLP baselines. Negative (healthy) WSIs serve as negative bags, which helps disambiguate non-tumor regions and makes the approach practical for clinical-scale WSI analysis. Overall, the work offers a scalable framework for WSI segmentation from partial supervisory signals that reduces annotation burden while maintaining high performance.
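To make the structure of the combined loss concrete, here is a minimal NumPy sketch for a single WSI (bag). It is an illustration under stated assumptions, not the authors' implementation: the summary does not give the exact forms of ${\mathcal L}_\mathrm{LLP}$ and ${\mathcal L}_\mathrm{MIL}$, so the sketch assumes cross-entropy between proportions for LLP, binary cross-entropy with max pooling for MIL, and a soft positive-instance mask equal to each patch's predicted tumor probability. The function name `lplp_loss` and the convention that column 0 is the non-tumor class are hypothetical.

```python
import numpy as np

def lplp_loss(patch_probs, partial_props, bag_label, w_mil=1.0, eps=1e-8):
    """Sketch of L = L_LLP + w_MIL * L_MIL for one bag (all forms are assumptions).

    patch_probs:   (N, C+1) per-patch class probabilities; column 0 = non-tumor.
    partial_props: (C,) proportions among tumor subtypes (unused for negative bags).
    bag_label:     1 if the WSI contains tumor, 0 for a healthy (negative) WSI.
    """
    patch_probs = np.asarray(patch_probs, dtype=float)
    partial_props = np.asarray(partial_props, dtype=float)

    # Soft positive-instance mask: probability that each patch is tumor.
    mask = 1.0 - patch_probs[:, 0]

    # MIL term: bag score via max pooling (assumed), then binary cross-entropy.
    s_hat = mask.max()
    l_mil = -(bag_label * np.log(s_hat + eps)
              + (1 - bag_label) * np.log(1.0 - s_hat + eps))

    # Negative bags carry no subtype proportions, so only the MIL term applies.
    if bag_label == 0:
        return w_mil * l_mil

    # LLP term: estimate subtype proportions over the (softly) positive patches.
    # Summing tumor-class probabilities is equivalent to weighting each patch's
    # conditional subtype distribution by its mask value.
    tumor_probs = patch_probs[:, 1:]
    p_hat = tumor_probs.sum(axis=0) / (tumor_probs.sum() + eps)
    l_llp = -np.sum(partial_props * np.log(p_hat + eps))

    return l_llp + w_mil * l_mil
```

Because the mask is a differentiable function of the patch predictions, the same quantity feeds both terms: the MIL term pushes it toward separating tumor from non-tumor patches, and the LLP term uses it to decide which patches count toward the subtype proportions. In an autodiff framework the same arithmetic would let gradients flow through both paths.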
Abstract
In this paper, we address the segmentation of tumor subtypes in whole slide images (WSIs) by utilizing incomplete label proportions. Specifically, we utilize 'partial' label proportions, which give the proportions among tumor subtypes but do not give the proportion between tumor and non-tumor. Pathologists record partial label proportions as standard diagnostic information, so we want to use them to train a segmentation model that classifies each WSI patch as one of the tumor subtypes or as non-tumor. We call this problem "learning from partial label proportions (LPLP)" and formulate it as a weakly supervised learning problem. We then propose an efficient algorithm for this challenging problem by decomposing it into two weakly supervised learning subproblems: multiple instance learning (MIL) and learning from label proportions (LLP). These subproblems are optimized efficiently in an end-to-end manner. The effectiveness of our algorithm is demonstrated through experiments conducted on two WSI datasets.
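The following short sketch illustrates what a partial label proportion does and does not encode. It is a toy example, not part of the paper: the helper name `partial_label_proportions` and the convention that class 0 is non-tumor while classes 1..C are tumor subtypes are assumptions made for illustration.

```python
import numpy as np

def partial_label_proportions(patch_labels, num_subtypes):
    """Proportions among tumor subtypes only; class 0 (non-tumor) is ignored.

    Returns None for a bag with no tumor patches (a negative bag).
    """
    labels = np.asarray(patch_labels)
    counts = np.bincount(labels, minlength=num_subtypes + 1)
    tumor_counts = counts[1:]          # drop the non-tumor count
    total = tumor_counts.sum()
    if total == 0:
        return None
    return tumor_counts / total

# Two slides with very different tumor fractions...
mostly_healthy = [0, 0, 0, 0, 1, 1, 1, 2]   # 4 of 8 patches are tumor
mostly_tumor = [0, 1, 1, 1, 2]              # 4 of 5 patches are tumor

# ...yield the same partial label proportions over the two subtypes.
p_a = partial_label_proportions(mostly_healthy, num_subtypes=2)
p_b = partial_label_proportions(mostly_tumor, num_subtypes=2)
```

Both slides give proportions of 0.75 and 0.25 for the two subtypes, even though their tumor-to-non-tumor ratios differ. This is the ambiguity that makes LPLP harder than ordinary LLP and motivates handling the tumor versus non-tumor decision as a separate MIL subproblem.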
