How Effective Can Dropout Be in Multiple Instance Learning?
Wenhui Zhu, Peijie Qiu, Xiwen Chen, Zhangsihao Yang, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
TL;DR
This work investigates dropout in multiple instance learning (MIL), with a focus on histological whole-slide image (WSI) classification where training is typically two-stage and feature embeddings are noisy. It reveals that dropping the top-k most important instances (top-k DropInstance) reduces gradient direction error and promotes flatter, more generalizable minima, and this insight motivates MIL-Dropout, a MIL-specific dropout method. MIL-Dropout uses a non-parametric averaging-based attention to rank instance importance and a query-based mechanism to drop the top-k and their similar instances, with normalization to stabilize training. Empirical results on five MIL benchmarks and two WSI datasets show consistent, substantial gains across diverse MIL aggregators at negligible computational cost, complemented by ablations on hyperparameters and lesion localization analyses. The findings offer both theoretical and practical contributions to regularizing MIL in challenging, weakly supervised settings like digital pathology.
Abstract
Multiple Instance Learning (MIL) is a popular weakly-supervised method for various applications, with particular interest in histological whole slide image (WSI) classification. Due to the gigapixel resolution of WSIs, applying MIL to WSIs typically necessitates a two-stage training scheme: first, features are extracted with a pre-trained backbone, and then MIL aggregation is performed. However, it is well known that this suboptimal training scheme suffers from "noisy" feature embeddings from the backbone and inherently weak supervision, hindering MIL from learning rich and generalizable features. Nevertheless, the most commonly used technique for mitigating this issue, dropout, has not yet been explored in MIL. In this paper, we empirically explore how effective dropout can be in MIL. Interestingly, we observe that dropping the top-k most important instances within a bag leads to better performance and generalization, even under noise attack. Based on this key observation, we propose a novel MIL-specific dropout method, termed MIL-Dropout, which systematically determines which instances to drop. Experiments on five MIL benchmark datasets and two WSI datasets demonstrate that MIL-Dropout boosts the performance of current MIL methods at negligible computational cost. The code is available at https://github.com/ChongQingNoSubway/MILDropout.
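
To make the top-k instance dropping described above concrete, the following is a minimal NumPy sketch, not the released MIL-Dropout implementation. The function name `topk_drop_instance` is hypothetical, and the importance score is an assumption: the cosine similarity between each instance embedding and the bag mean, chosen as one possible parameter-free, averaging-based ranking. The dropping of instances similar to the top-k and the normalization step are omitted.

```python
import numpy as np


def topk_drop_instance(bag, k):
    """Zero out the k instances of a bag with the highest importance scores.

    bag: array of shape (n_instances, dim) holding instance embeddings.
    k:   number of instances to drop (clipped to the bag size).

    ASSUMPTION: importance is the cosine similarity between an instance
    and the bag mean. This is an illustrative choice, not necessarily
    the scoring used by MIL-Dropout.
    """
    bag = np.asarray(bag, dtype=float)
    n = bag.shape[0]
    k = min(k, n)

    # Parameter-free importance: alignment of each instance with the bag mean.
    mean = bag.mean(axis=0)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    scores = (bag @ mean) / (
        np.linalg.norm(bag, axis=1) * np.linalg.norm(mean) + eps
    )

    # Indices of the k highest-scoring instances.
    drop_idx = np.argsort(-scores)[:k]

    # Boolean keep-mask: False marks a dropped instance.
    mask = np.ones(n, dtype=bool)
    mask[drop_idx] = False

    # Dropped instances become zero vectors; the rest are unchanged.
    return bag * mask[:, None], mask
```

In a two-stage pipeline, such a function would be applied to the pre-extracted instance features of each bag during training only, before they are passed to the MIL aggregator; at inference time the bag would be left intact.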
