Improved Kidney Stone Recognition Through Attention and Multi-View Feature Fusion Strategies

Elias Villalvazo-Avila, Francisco Lopez-Tiro, Jonathan El-Beze, Jacques Hubert, Miguel Gonzalez-Mendoza, Gilberto Ochoa-Ruiz, Christian Daul

TL;DR

This approach is specifically designed to mimic the morpho-constitutional analysis performed ex vivo by biologists, who visually identify kidney stones by inspecting both the surface and section views.

Abstract

This contribution presents a deep learning method for extracting and fusing information from kidney stone fragment images acquired from different endoscope viewpoints. Surface and section fragment images are used jointly to train the classifier, and attention layers are added at the end of each convolutional block to improve the discriminative power of the features. This approach is specifically designed to mimic the morpho-constitutional analysis performed ex vivo by biologists, who visually identify kidney stones by inspecting both views. Adding attention mechanisms to the backbone improved the results of single-view extraction backbones by 4% on average. Moreover, compared with the state of the art, fusing the deep features improved overall kidney stone classification accuracy by up to 11%.
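The abstract does not say which attention mechanism is used, so the following is only a minimal sketch of one common choice: squeeze-and-excitation style channel attention applied to the output of a convolutional block. The function name `channel_attention`, the weights `w1` and `w2`, the reduction ratio, and the random feature map are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    Assumed design (not from the paper): global average pooling,
    a two-layer gating MLP, and a sigmoid gate per channel.
    """
    squeezed = feature_map.mean(axis=(1, 2))       # (C,) one descriptor per channel
    hidden = np.maximum(0.0, w1 @ squeezed)        # (C // r,) ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # (C,) values in (0, 1)
    return feature_map * gates[:, None, None]      # broadcast gates over H and W

# Hypothetical stand-in for the output of one convolutional block.
rng = np.random.default_rng(0)
channels, reduction = 8, 2
block_output = rng.standard_normal((channels, 4, 4))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1

attended = channel_attention(block_output, w1, w2)
```

In this sketch the attended map keeps the shape of the block output, so a layer of this kind could be appended after each block without changing the rest of the backbone.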

Paper Structure (10 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Example kidney stone images from the dataset used in this work (el2022evaluation). The dataset covers the six most common kidney stone types: whewellite (WW), weddellite (WD), uric acid (AU), struvite (STR), brushite (BRU), and cystine (CYS).
  • Figure 2: Proposed multi-view (MV) model with attention. The first part of the model consists of the duplicated feature extraction layers from the ResNet50 model. These layers are followed by the fusion layer, which combines information from the two views (i.e., from the two image types). The fused feature map is then fed to the classification layer.
  • Figure 3: UMAP visualizations of the features extracted by the models. (a) No-attention mixed model (Mixed Base model), (b) Mixed Base model + Attention, and (c) MV model (max pool) + Attention. See the results table (tab:results1) for more details about the trained models.
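
The pipeline described in the Figure 2 caption (two feature extractors, a fusion layer, then a classifier) can be illustrated with a small sketch. It assumes each view has already been reduced to a feature vector, and it reads the "max pool" variant named in the Figure 3 caption as an element-wise maximum over the two views, which is an interpretation rather than something the captions state. The names `fuse_views` and `classify`, the feature dimension, and the random weights are hypothetical.

```python
import numpy as np

# Class abbreviations as listed in the Figure 1 caption.
CLASSES = ["WW", "WD", "AU", "STR", "BRU", "CYS"]

def fuse_views(surface_feat, section_feat, mode="max"):
    """Combine surface and section features into one vector."""
    if mode == "max":
        # Assumed reading of "max pool": keep the larger activation per feature.
        return np.maximum(surface_feat, section_feat)
    if mode == "concat":
        # Alternative fusion, shown only for comparison.
        return np.concatenate([surface_feat, section_feat], axis=-1)
    raise ValueError(f"unknown fusion mode: {mode}")

def classify(fused, weights, bias):
    """Linear layer followed by a numerically stable softmax."""
    logits = fused @ weights + bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Hypothetical stand-ins for the outputs of the two feature extractors.
rng = np.random.default_rng(0)
feat_dim = 16
surface_feat = rng.standard_normal(feat_dim)
section_feat = rng.standard_normal(feat_dim)

fused = fuse_views(surface_feat, section_feat, mode="max")
weights = rng.standard_normal((feat_dim, len(CLASSES))) * 0.1
bias = np.zeros(len(CLASSES))
probs = classify(fused, weights, bias)
predicted = CLASSES[int(np.argmax(probs))]
```

Element-wise maximum keeps the fused vector the same size as a single-view vector, whereas concatenation doubles it and would require a wider classification layer.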