An Ordinal Regression Framework for a Deep Learning Based Severity Assessment for Chest Radiographs

Patrick Wienholt, Alexander Hermans, Firas Khader, Behrus Puladi, Bastian Leibe, Christiane Kuhl, Sven Nebelung, Daniel Truhn

TL;DR

The paper tackles ordinal severity scoring in chest radiographs by introducing a modular ordinal-regression framework that separates the task into a model, a target function, and a classification function. It systematically compares multiple target encodings (One-Hot, Gaussian, Continuous, Progress-Bar, Soft-Progress-Bar, Binary) using ResNet50 and ViT-B-16, evaluated with unweighted and weighted Cohen's kappa ($\kappa$) metrics. Key findings show no universally optimal method across metrics or architectures; One-Hot excels for unweighted kappa, while encodings that reflect ordinal structure (Gaussian, Progress-Bar, Soft-Progress-Bar) often perform better for weighted kappas. The work provides practical guidance for selecting encodings and weights in clinical settings and offers code for reproducible evaluation of ordinal regression in medical imaging.

Abstract

This study investigates the application of ordinal regression methods for categorizing disease severity in chest radiographs. We propose a framework that divides the ordinal regression problem into three parts: a model, a target function, and a classification function. Different encoding methods, including one-hot, Gaussian, progress-bar, and our soft-progress-bar, are applied using ResNet50 and ViT-B-16 deep learning models. We show that the choice of encoding has a strong impact on performance and that the best encoding depends on the chosen weighting of Cohen's kappa and also on the model architecture used. We make our code publicly available on GitHub.
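
The split into a target function and a classification function can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the progress-bar layout (the first `label` entries set to 1 in a vector of length `num_classes - 1`) is an assumption about that encoding, and the nearest-target rule is only one plausible classification function. The model is stood in for by a plain list of output values.

```python
from typing import Callable, List, Sequence

Vector = List[float]
TargetFn = Callable[[int, int], Vector]


def one_hot_target(label: int, num_classes: int) -> Vector:
    """Standard one-hot encoding: a single 1 at the class index."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def progress_bar_target(label: int, num_classes: int) -> Vector:
    """Assumed progress-bar layout: the first `label` entries are 1.

    The vector has num_classes - 1 entries, so class 0 maps to all zeros
    and the highest class maps to all ones.
    """
    return [1.0 if i < label else 0.0 for i in range(num_classes - 1)]


def nearest_target_classifier(
    output: Sequence[float], target_fn: TargetFn, num_classes: int
) -> int:
    """Hypothetical classification function.

    Returns the class whose target vector has the smallest squared
    Euclidean distance to the model output.
    """

    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(num_classes),
        key=lambda c: squared_distance(output, target_fn(c, num_classes)),
    )


if __name__ == "__main__":
    # Training would use target_fn(label, num_classes) as the regression target;
    # inference uses only the model output and the classification function.
    fake_output = [0.9, 0.8, 0.2, 0.1]  # stand-in for a model prediction
    print(nearest_target_classifier(fake_output, progress_bar_target, 5))
```

Because the classifier only takes a target function as an argument, swapping the encoding does not require changing the inference code, which mirrors the modularity the framework describes.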

Paper Structure (31 sections, 16 equations, 2 figures, 17 tables)

This paper contains 31 sections, 16 equations, 2 figures, 17 tables.

Figures (2)

  • Figure 1: In our framework, the regression task is divided into three parts: the model trained on the task, the target function that defines a vector for each class the model is trained on, and the classification function that maps the output of the trained model to one of the possible classes. During training, only the model and the target function are used, while during inference only the model and the classification function are used. The modular framework makes it possible to exchange the three parts.
  • Figure 2: Change in the ranking of the methods for ResNet50 as the weighting used for Cohen's kappa is changed. The figure shows which methods rank better or worse and which lose rank under a different weighting.
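
Rankings can shift with the kappa weighting because the weighting changes how strongly distant misclassifications are penalized. The following sketch computes Cohen's kappa from its standard definition with no, linear, or quadratic disagreement weights. The function name and the `weighting` argument are illustrative choices, not taken from the paper's code, and the example labels are made up.

```python
from typing import Sequence


def cohen_kappa(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_classes: int,
    weighting: str = "none",
) -> float:
    """Cohen's kappa with disagreement weights w_ij.

    weighting: "none" (w = 1 if i != j), "linear" (w = |i - j|),
    or "quadratic" (w = (i - j)^2).
    """
    n = len(y_true)
    # Observed confusion matrix: rows are true classes, columns are predictions.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [
        sum(observed[i][j] for i in range(num_classes)) for j in range(num_classes)
    ]

    def weight(i: int, j: int) -> float:
        d = abs(i - j)
        if weighting == "none":
            return 1.0 if d else 0.0
        if weighting == "linear":
            return float(d)
        if weighting == "quadratic":
            return float(d * d)
        raise ValueError(f"unknown weighting: {weighting}")

    classes = range(num_classes)
    # Weighted observed disagreement.
    observed_cost = sum(weight(i, j) * observed[i][j] for i in classes for j in classes)
    # Weighted disagreement expected by chance from the marginals.
    expected_cost = sum(
        weight(i, j) * row_totals[i] * col_totals[j] / n
        for i in classes
        for j in classes
    )
    # Note: expected_cost is zero if all labels and predictions share one class.
    return 1.0 - observed_cost / expected_cost


if __name__ == "__main__":
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    near_miss = [0, 1, 1, 1, 2, 2, 3, 3]  # one error, off by one class
    far_miss = [0, 3, 1, 1, 2, 2, 3, 3]   # one error, off by three classes
    for w in ("none", "linear", "quadratic"):
        print(w, cohen_kappa(y_true, near_miss, 4, w), cohen_kappa(y_true, far_miss, 4, w))
```

In this toy example both prediction sets contain exactly one error, so the unweighted kappa does not separate them, while the linear and quadratic variants score the near miss higher than the far miss.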