Class Imbalance Correction for Improved Universal Lesion Detection and Tagging in CT

Peter D. Erickson; Tejas Sudharshan Mathai; Ronald M. Summers

TL;DR

The public DeepLesion CT dataset has a severe imbalance across body-part lesion labels, along with missing annotations and inconsistent tags, all of which can hinder automatic universal lesion detection and tagging. Using a limited annotated subset $D_L$, the authors train a VFNet detector and compare it with Faster RCNN, RetinaNet, and FoveaBox, applying weighted boxes fusion (WBF) to combine predictions. They run three data-balancing experiments, $E_{BP}$, $E_{N}$, and $E_{S}$, which balance the data by body part, per-patient lesion count, and lesion size respectively, and compare them against a randomly sampled, unbalanced baseline $E_{U}$. Balancing by body part label increases recall for under-represented classes across the models tested, and balancing by lesion size improves VFNet recall for all classes. The paper also provides a structured reporting guideline for describing lesions in the radiology report.

Abstract

Radiologists routinely detect and size lesions in CT to stage cancer and assess tumor burden. To potentially aid their efforts, multiple lesion detection algorithms have been developed with a large public dataset called DeepLesion (32,735 lesions, 32,120 CT slices, 10,594 studies, 4,427 patients, 8 body part labels). However, this dataset contains missing measurements and lesion tags, and exhibits a severe imbalance in the number of lesions per label category. In this work, we utilize a limited subset of DeepLesion (6\%, 1331 lesions, 1309 slices) containing lesion annotations and body part label tags to train a VFNet model to detect lesions and tag them. We address the class imbalance by conducting three experiments: 1) Balancing data by the body part labels, 2) Balancing data by the number of lesions per patient, and 3) Balancing data by the lesion size. In contrast to a randomly sampled (unbalanced) data subset, our results indicated that balancing the body part labels always increased sensitivity for lesions $\geq$ 1 cm for classes with low data quantities (Bone: 80\% vs. 46\%, Kidney: 77\% vs. 61\%, Soft Tissue: 70\% vs. 60\%, Pelvis: 83\% vs. 76\%). Similar trends were seen for three other models tested (Faster RCNN, RetinaNet, FoveaBox). Balancing data by lesion size also helped the VFNet model improve recall for all classes in contrast to an unbalanced dataset. We also provide a structured reporting guideline for a ``Lesions'' subsection to be entered into the ``Findings'' section of a radiology report. To our knowledge, we are the first to report the class imbalance in DeepLesion, and have taken data-driven steps to address it in the context of joint lesion detection and tagging.
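To make the idea of balancing by body part label concrete, the following is a minimal sketch and not the authors' procedure. The abstract does not say how the balanced subset was drawn, so the sketch assumes simple undersampling of every class down to the size of the rarest one. The function name `balance_by_label`, the record layout, and the toy lesion counts are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balance_by_label(records, label_key="body_part", seed=0):
    """Undersample each class to the size of the smallest class.

    Assumption: undersampling is used here for illustration only; the
    abstract does not state how the balanced subset was drawn.
    """
    # Group the lesion records by their body part label.
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    # Every class is cut down to the size of the rarest class.
    target = min(len(items) for items in by_label.values())

    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):  # sorted for reproducibility
        balanced.extend(rng.sample(by_label[label], target))
    rng.shuffle(balanced)
    return balanced


# Toy, made-up counts (not taken from DeepLesion).
toy_counts = {"Lung": 50, "Bone": 5, "Kidney": 8}
records = [
    {"lesion_id": f"{label}-{i}", "body_part": label}
    for label, n in toy_counts.items()
    for i in range(n)
]

balanced = balance_by_label(records)
counts = Counter(rec["body_part"] for rec in balanced)
print(dict(counts))
```

With these toy counts every class ends up with five lesions, the size of the rarest class. Oversampling the rare classes, or capping only the largest ones, would be equally plausible readings of "balancing".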
