Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling
Philip Müller, Felix Meissen, Georgios Kaissis, Daniel Rueckert
TL;DR
This work tackles the challenge of localizing pathologies in chest X-rays using only image-level labels. It introduces Weakly Supervised ROI Proposal Networks (WSRPN), a differentiable, end-to-end system that learns bounding box proposals via ROI attention and Gaussian ROI pooling. Training uses a two-branch multiple instance learning (MIL) framework, with a patch branch and an ROI branch, plus a consistency loss between the branches. On ChestX-ray8, WSRPN achieves state-of-the-art results on RoDeO, AP, and localization metrics, and extensive ablations show the contribution of the loss terms, the ROI tokens, and the Gaussian pooling scheme. Because box parameters are optimized end-to-end under weak supervision, the approach offers practical clinical value and could be extended to multimodal or semi-supervised settings.
Abstract
Weakly supervised object detection (WSup-OD) increases the usefulness and interpretability of image classification algorithms without requiring additional supervision. The successes of multiple instance learning in this task for natural images, however, do not translate well to medical images due to the very different characteristics of their objects (i.e. pathologies). In this work, we propose Weakly Supervised ROI Proposal Networks (WSRPN), a new method for generating bounding box proposals on the fly using a specialized region of interest-attention (ROI-attention) module. WSRPN integrates well with classic backbone-head classification algorithms and is end-to-end trainable with only image-label supervision. We experimentally demonstrate that our new method outperforms existing methods in the challenging task of disease localization in chest X-ray images. Code: https://github.com/philip-mueller/wsrpn
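To make the idea of soft, differentiable box pooling more concrete, the following is a minimal NumPy sketch of Gaussian ROI pooling under stated assumptions. It is not taken from the WSRPN code. The function name `gaussian_roi_pool`, the use of a plain separable Gaussian, and the choice to treat the box size directly as the standard deviation are all illustrative assumptions. The point it shows is that every patch receives a smooth weight that depends on the box center and size, so the pooled feature varies smoothly with the box parameters.

```python
import numpy as np

def gaussian_roi_pool(features, center, size):
    """Pool an (H, W, C) feature map into one C-vector with a soft Gaussian box.

    center and size are (x, y) pairs in relative [0, 1] image coordinates.
    Assumption (illustrative only): size is used as the Gaussian std. dev.
    """
    h, w, _ = features.shape
    # Relative coordinates of the patch centers along each axis.
    ys = (np.arange(h) + 0.5) / h
    xs = (np.arange(w) + 0.5) / w
    cx, cy = center
    sx, sy = size
    # Separable Gaussian: one weight per row times one weight per column.
    wy = np.exp(-0.5 * ((ys - cy) / sy) ** 2)
    wx = np.exp(-0.5 * ((xs - cx) / sx) ** 2)
    weights = np.outer(wy, wx)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Weighted average of the patch features over both spatial axes.
    pooled = np.tensordot(weights, features, axes=([0, 1], [0, 1]))
    return pooled, weights

# Toy example: a single active patch at row 2, column 5 of an 8x8 map.
features = np.zeros((8, 8, 2))
features[2, 5, 0] = 1.0
pooled, weights = gaussian_roi_pool(
    features, center=(5.5 / 8, 2.5 / 8), size=(0.1, 0.1)
)
```

In an automatic differentiation framework, the same operations would yield gradients with respect to `center` and `size`, which is what allows box parameters to be trained from an image-level classification loss alone.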
