Learning to diagnose from scratch by exploiting dependencies among labels
Li Yao, Eric Poblenz, Dmitry Dagunts, Ben Covington, Devon Bernard, Kevin Lyman
TL;DR
This work tackles multi-label chest X-ray diagnosis under data scarcity by proposing a two-stage architecture: a DenseNet-inspired encoder and an LSTM-based decoder. It demonstrates that training from scratch, with no pre-training, can outperform prior state-of-the-art methods, and it shows additional gains from explicitly modeling dependencies among abnormalities using a fixed-length, sigmoid-decoded sequence. The authors introduce a set of clinically relevant metrics to benchmark performance beyond traditional accuracy, and they discuss potential biases from learned interdependencies and directions for ontology-informed improvements. Overall, the approach achieves strong, interpretable performance on ChestX-ray8 and highlights the practical value of dependency-aware decoding in medical imaging.
Abstract
The field of medical diagnostics contains a wealth of challenges that closely resemble classical machine learning problems; practical constraints, however, complicate the naive translation of these tasks into classical architectures. Many tasks in radiology, for example, are largely problems of multi-label classification, wherein medical images are interpreted to indicate multiple present or suspected pathologies. Clinical settings demand high accuracy simultaneously across a multitude of pathological outcomes and greatly limit the utility of tools that consider only a subset. This issue is exacerbated by a general scarcity of training data, which heightens the need to extract clinically relevant features from available samples, ideally without the use of pre-trained models that may carry forward undesirable biases from tangentially related tasks. We present and evaluate a partial solution to these constraints: using LSTMs to leverage interdependencies among target labels in predicting 14 pathologic patterns from chest x-rays. We establish state-of-the-art results on the largest publicly available chest x-ray dataset from the NIH without pre-training. Furthermore, we propose and discuss alternative evaluation metrics and their relevance in clinical practice.
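To make the idea of leveraging interdependencies among labels concrete, the following is a minimal sketch, not the authors' implementation. It assumes the labels are predicted in a fixed order and that each label's probability comes from a sigmoid conditioned on the image and on the labels already decided, so the joint probability factorizes by the chain rule. A small linear scorer stands in for the LSTM decoder, and every name and parameter value here (`conditional_prob`, `BIAS`, `IMAGE_WEIGHT`, `PREV_WEIGHT`, a scalar `image_feature`) is hypothetical and chosen only for illustration.

```python
import math
from itertools import product

NUM_LABELS = 3  # the paper predicts 14 patterns; 3 keeps full enumeration cheap

# Hypothetical parameters (assumptions, not taken from the paper).
BIAS = [-1.0, 0.5, -0.3]
IMAGE_WEIGHT = [0.8, -0.4, 0.6]
# PREV_WEIGHT[t][j] is the influence of earlier label j on label t (j < t).
PREV_WEIGHT = [[], [1.2], [-0.7, 0.9]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conditional_prob(t, prev_labels, image_feature):
    """P(y_t = 1 | y_0..y_{t-1}, image) under the stand-in linear scorer."""
    z = BIAS[t] + IMAGE_WEIGHT[t] * image_feature
    z += sum(w * y for w, y in zip(PREV_WEIGHT[t], prev_labels))
    return sigmoid(z)


def joint_prob(labels, image_feature):
    """Chain-rule product of per-step Bernoulli probabilities."""
    p = 1.0
    for t, y in enumerate(labels):
        p_t = conditional_prob(t, labels[:t], image_feature)
        p *= p_t if y == 1 else 1.0 - p_t
    return p


def greedy_decode(image_feature, threshold=0.5):
    """Decide labels one at a time, feeding each decision to the next step."""
    labels = []
    for t in range(NUM_LABELS):
        p_t = conditional_prob(t, labels, image_feature)
        labels.append(1 if p_t >= threshold else 0)
    return labels


# Sanity check: the factorized probabilities of all label vectors sum to 1.
total = sum(joint_prob(list(y), 0.3) for y in product([0, 1], repeat=NUM_LABELS))
```

Because each step conditions on earlier decisions, the model can represent co-occurrence patterns that independent per-label sigmoids cannot. The same mechanism is also how a learned dependency could bias later predictions.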
