Evaluation of Few-Shot Learning Methods for Kidney Stone Type Recognition in Ureteroscopy

Carlos Salazar-Ruiz, Francisco Lopez-Tiro, Ivan Reyes-Amezcua, Clement Larose, Gilberto Ochoa-Ruiz, Christian Daul

TL;DR

This work tackles the challenge of kidney stone type recognition during ureteroscopy, where acquiring large labeled datasets is difficult. It adopts Few-Shot Learning using Prototypical Networks with ImageNet-pretrained ResNet backbones to classify six stone subtypes from ex vivo endoscopic patches across surface, section, and mixed views. The results show that Prototypical Networks can outperform traditional DL models even when using as little as 25% of the training data, with the 6-way 10-shot configuration and ResNet-34 backbone delivering robust performance across views. This data-efficient approach has the potential to enable real-time, automated stone typing that guides treatment decisions and reduces reliance on time-consuming ex vivo analyses.

Abstract

Determining the type of kidney stones is crucial for prescribing appropriate treatments to prevent recurrence. Currently, various approaches exist to identify the type of kidney stones. However, obtaining results through the reference ex vivo identification procedure can take several weeks, while in vivo visual recognition requires highly trained specialists. For this reason, deep learning models have been developed to provide urologists with an automated classification of kidney stones during ureteroscopies. Nevertheless, a common issue with these models is the lack of training data. This contribution presents a deep learning method based on few-shot learning, aimed at producing sufficiently discriminative features for identifying kidney stone types in endoscopic images, even with a very limited number of samples. This approach was specifically designed for scenarios where endoscopic images are scarce or where uncommon classes are present, enabling classification even with a limited training dataset. The results demonstrate that Prototypical Networks, using up to 25% of the training data, can achieve performance equal to or better than traditional deep learning models trained with the complete dataset.

Paper Structure

This paper contains 19 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Examples of endoscopic kidney stone images (acquired ex vivo). Rows 1 and 2 show surface and section images, respectively. Rows 3 and 4 show $256 \times 256$ patches extracted from the images in rows 1 and 2, respectively.
  • Figure 2: Representation of the Prototypical Networks method, which consists of three steps: feature embedding, prototype initialization, and query prediction. In the feature embedding step, embeddings are extracted from the support set using a backbone network, such as ResNet, as an encoder. In the prototype initialization step, one prototype per class is generated from the embeddings of the labeled support samples. Finally, in the query prediction step, the features of each query sample are compared to the prototypes using a similarity metric.
  • Figure 3: Qualitative comparison between (a) Prototypical Networks using 25% of the data and (b) traditional deep learning (DL) models using 100% of the data.
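
The prototype initialization and query prediction steps described in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the embeddings have already been produced by an encoder, it averages the support embeddings of each class to form the prototype, and it uses squared Euclidean distance as the similarity metric (the caption only says "a similarity metric"). The function names, the class labels `class_a` and `class_b`, and the 2-D toy vectors are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def compute_prototypes(
    support_embeddings: List[Vector], support_labels: List[str]
) -> Dict[str, List[float]]:
    """Prototype initialization: average the support embeddings of each class."""
    grouped: Dict[str, List[Vector]] = defaultdict(list)
    for embedding, label in zip(support_embeddings, support_labels):
        grouped[label].append(embedding)

    prototypes: Dict[str, List[float]] = {}
    for label, embeddings in grouped.items():
        dim = len(embeddings[0])
        prototypes[label] = [
            sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)
        ]
    return prototypes


def squared_euclidean(a: Vector, b: Vector) -> float:
    """Assumed similarity metric: smaller distance means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(query_embedding: Vector, prototypes: Dict[str, List[float]]) -> str:
    """Query prediction: return the label of the nearest prototype."""
    return min(
        prototypes,
        key=lambda label: squared_euclidean(query_embedding, prototypes[label]),
    )


# Toy 2-way 2-shot episode with hypothetical, precomputed 2-D embeddings.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["class_a", "class_a", "class_b", "class_b"]

prototypes = compute_prototypes(support, labels)
prediction = predict([1.0, 1.0], prototypes)
print(prototypes)
print(prediction)
```

In the 6-way 10-shot setting mentioned in the TL;DR, the same logic would apply with six classes and ten support embeddings per class; the feature embedding step (for example a ResNet backbone) is deliberately left out of this sketch.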