Rethinking Knee Osteoarthritis Severity Grading: A Few Shot Self-Supervised Contrastive Learning Approach

Niamh Belton, Misgina Tsighe Hagos, Aonghus Lawlor, Kathleen M. Curran

TL;DR

The paper tackles the subjectivity of Kellgren-Lawrence (KL) OA grading by proposing a continuous OA severity score derived from an X-ray's distance to a learned representation of normal knees. It introduces SS-FewSOME, a self-supervised, few-shot framework that uses a Stochastic Data Augmentation pipeline, patch-level representations from early AlexNet layers, and a cosine-similarity loss to model normality. The approach combines pseudo labeling on unlabelled data with CLIP-based denoising and retraining on a larger network to yield continuous OA scores, achieving a Spearman Rank Correlation Coefficient (SRCC) of $0.43$ and an AUC of $91.2$ for severe OA detection, outperforming several baselines. These results suggest a practical path to continuous OA grading with little data and few annotations, with potential clinical impact through reduced subjectivity and finer-grained assessment.
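
The idea of scoring severity as distance from normality can be illustrated with a minimal sketch. It assumes each X-ray has already been mapped to an embedding vector, and it averages the cosine distance to a small set of normal reference embeddings. The function names, the toy vectors, and the choice of a mean are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def severity_score(query, normal_refs):
    """Mean cosine distance (1 - similarity) from a query embedding to the
    normal reference embeddings. Aggregating with a mean is an assumption."""
    distances = [1.0 - cosine_similarity(query, ref) for ref in normal_refs]
    return sum(distances) / len(distances)


# Toy 2-D embeddings (hypothetical values, for illustration only).
normal_refs = [[1.0, 0.0], [0.9, 0.1]]
healthy_like = [1.0, 0.05]   # points in roughly the same direction as the references
severe_like = [0.0, 1.0]     # points in a very different direction

print(severity_score(healthy_like, normal_refs))
print(severity_score(severe_like, normal_refs))
```

In this toy setup the embedding that points away from the normal references receives the larger score, which is the behaviour a continuous severity scale relies on.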

Abstract

Knee Osteoarthritis (OA) is a debilitating disease affecting over 250 million people worldwide. Currently, radiologists grade the severity of OA on an ordinal scale from zero to four using the Kellgren-Lawrence (KL) system. Recent studies have raised concern in relation to the subjectivity of the KL grading system, highlighting the requirement for an automated system, while also indicating that five ordinal classes may not be the most appropriate approach for assessing OA severity. This work presents preliminary results of an automated system with a continuous grading scale. This system, namely SS-FewSOME, uses self-supervised pre-training to learn robust representations of the features of healthy knee X-rays. It then assesses the OA severity by the X-rays' distance to the normal representation space. SS-FewSOME initially trains on only 'few' examples of healthy knee X-rays, thus reducing the barriers to clinical implementation by eliminating the need for large training sets and costly expert annotations that existing automated systems require. The work reports promising initial results, obtaining a positive Spearman Rank Correlation Coefficient of 0.43, having had access to only 30 ground truth labels at training time.
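
For readers unfamiliar with the reported metric, the following sketch shows one way a Spearman Rank Correlation Coefficient between continuous severity scores and ordinal KL grades could be computed: rank both sequences (averaging ranks over ties, which are common with five ordinal grades) and take the Pearson correlation of the ranks. The helper names and example values are hypothetical and do not reproduce the paper's evaluation.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical KL grades and model scores (not data from the paper).
kl_grades = [0, 0, 1, 2, 3, 4]
scores = [0.10, 0.25, 0.20, 0.40, 0.35, 0.90]
print(spearman(kl_grades, scores))
```

A coefficient near 1 means the scores order the X-rays almost exactly as the KL grades do; a value of 0.43, as reported in the abstract, indicates a moderate positive monotonic relationship.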

Paper Structure (3 sections, 1 figure, 1 table)

This paper contains 3 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Self-supervised pre-training of SS-FewSOME, as outlined in the text. The figure also shows two example X-rays that receive 'high' anomaly scores because of metal in the image. As shown in the bottom right of the figure, the CLIP model, together with a dictionary of statements, can be used to identify such X-rays and remove them from the pseudo labelled X-rays before model retraining.
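
The denoising step described in the caption can be sketched roughly as follows. The sketch assumes image and text embeddings have already been produced by a CLIP-style model and are supplied as plain vectors; it does not call any real CLIP API. The statements, the embedding values, and the function names are hypothetical, and the paper's actual dictionary of statements is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_statement(image_emb, statement_embs):
    """Return the statement whose text embedding is most similar to the image."""
    return max(statement_embs,
               key=lambda s: cosine_similarity(image_emb, statement_embs[s]))


def filter_pseudo_labels(image_embs, statement_embs, reject):
    """Keep images whose best-matching statement is not in the reject set."""
    return [name for name, emb in image_embs.items()
            if best_statement(emb, statement_embs) not in reject]


# Hypothetical statements and toy 2-D embeddings, for illustration only.
statement_embs = {
    "an x-ray of a knee": [1.0, 0.1],
    "an x-ray of a knee with a metal implant": [0.1, 1.0],
}
image_embs = {
    "xray_a": [0.9, 0.2],    # closest to the plain-knee statement
    "xray_b": [0.2, 0.95],   # closest to the metal-implant statement
}
reject = {"an x-ray of a knee with a metal implant"}

kept = filter_pseudo_labels(image_embs, statement_embs, reject)
print(kept)
```

Filtering on the best-matching statement is one simple decision rule; a threshold on the similarity margin would be an equally plausible alternative under the same assumptions.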