Learning to diagnose from scratch by exploiting dependencies among labels

Li Yao, Eric Poblenz, Dmitry Dagunts, Ben Covington, Devon Bernard, Kevin Lyman

TL;DR

This work tackles multi-label chest X-ray diagnosis under data scarcity by proposing a two-stage architecture: a DenseNet-inspired encoder and an LSTM-based decoder. It demonstrates that training from scratch without pretraining can outperform prior state-of-the-art methods and shows additional gains by explicitly modeling dependencies among abnormalities using a fixed-length, sigmoid-decoded sequence. A comprehensive set of clinically relevant metrics is introduced to benchmark performance beyond traditional accuracy, and the authors discuss potential biases from learned interdependencies and directions for ontology-informed improvements. Overall, the approach achieves strong, interpretable performance on ChestX-ray8 and highlights the practical value of dependency-aware decoding in medical imaging.

Abstract

The field of medical diagnostics contains a wealth of challenges that closely resemble classical machine learning problems; practical constraints, however, complicate any naive translation of these problems into classical architectures. Many tasks in radiology, for example, are largely problems of multi-label classification, wherein medical images are interpreted to indicate multiple present or suspected pathologies. Clinical settings demand high accuracy simultaneously across a multitude of pathological outcomes and greatly limit the utility of tools that consider only a subset. This issue is exacerbated by a general scarcity of training data, which heightens the need to extract clinically relevant features from the available samples, ideally without the use of pre-trained models that may carry forward undesirable biases from tangentially related tasks. We present and evaluate a partial solution to these constraints: using LSTMs to leverage interdependencies among target labels when predicting 14 pathologic patterns from chest x-rays. We establish state-of-the-art results on the largest publicly available chest x-ray dataset from the NIH without pre-training. Furthermore, we propose and discuss alternative evaluation metrics and their relevance in clinical practice.

Paper Structure

This paper contains 21 sections, 9 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: The input image is encoded by a densely connected convolutional neural network (top). Similar to the DenseNets of Huang et al. (2017), our variant consists of DenseBlocks and TransitionBlocks. Within each DenseBlock, there are several ConvBlocks. The resulting encoded representation of the input is a vector that captures the higher-order semantics that are useful for the decoding task. $K$ is the growth rate as defined by Huang et al. (2017), and $S$ is the stride. We also include the filter and pooling dimensionality when applicable. Unlike a DenseNet, which has 16 to 32 ConvBlocks within a DenseBlock, our model uses 4 in order to keep the total number of parameters small. Our proposed RNN decoder is illustrated on the bottom right.
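
To make the role of the growth rate $K$ concrete, the following is a minimal sketch of how the channel count grows inside one DenseBlock, assuming the standard DenseNet rule that each ConvBlock adds $K$ feature maps, which are concatenated onto its input. The function name `dense_block_channels` and the sizes used (64 input channels, $K = 32$) are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def dense_block_channels(in_channels, growth_rate, num_conv_blocks=4):
    """Return the channel count entering each ConvBlock, followed by the
    channel count leaving the DenseBlock.

    Assumes the standard DenseNet concatenation rule; the default of
    4 ConvBlocks matches the figure caption.
    """
    counts = [in_channels]
    for _ in range(num_conv_blocks):
        # Each ConvBlock emits `growth_rate` new feature maps, which are
        # concatenated onto everything produced before it.
        counts.append(counts[-1] + growth_rate)
    return counts


# Hypothetical sizes, for illustration only.
channels = dense_block_channels(in_channels=64, growth_rate=32)
print(channels)  # [64, 96, 128, 160, 192]
```

With only 4 ConvBlocks the block adds $4K$ channels in total, whereas a 16- or 32-ConvBlock DenseBlock would add $16K$ or $32K$. This is one way to see why the shallower block keeps the parameter count small.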