A Systematic Review of Low-Rank and Local Low-Rank Matrix Approximation in Big Data Medical Imaging

Sisipho Hamlomo, Marcellin Atemkeng, Yusuf Brima, Chuneeta Nunhokee, Jeremy Baxter

TL;DR

The paper surveys the literature on low-rank matrix approximation (LRMA) and local low-rank matrix approximation (LLRMA) in medical imaging, documenting a shift from LRMA to LLRMA since 2015 and highlighting LLRMA's ability to capture local structure while reducing computational demands. It covers a wide range of modalities (MRI, CT, X-ray, ultrasound, PET, multispectral, and retinal imaging) and datasets, analyzes similarity-measure approaches, and discusses the limitations of shallow patch-similarity methods. The authors advocate measuring semantic similarity with deep segmentation models (e.g., DeepLab), extending LRMA to structured and semi-structured data, and optimizing patch size with a hybrid strategy that combines random search with Bayesian optimization. Their findings underscore LLRMA's practical impact on denoising, reconstruction, and fusion in medical imaging, and outline future directions for improving scalability, data-type coverage, and similarity measurement. Overall, the work provides a roadmap for applying and extending LRMA and LLRMA techniques to diverse medical datasets and tasks, with potential clinical impact through improved data quality and efficiency.

Abstract

The large volume and complexity of medical imaging datasets are bottlenecks for storage, transmission, and processing. To tackle these challenges, low-rank matrix approximation (LRMA) and its derivative, local LRMA (LLRMA), have demonstrated potential. A detailed analysis of the literature identifies LRMA and LLRMA methods applied to various imaging modalities, and addresses the challenges and limitations of existing methods. We note a significant shift towards LLRMA in the medical imaging field since 2015, demonstrating its potential and effectiveness in capturing complex structures in medical data compared to LRMA. Acknowledging the limitations of the shallow similarity methods used with LLRMA, we suggest advanced semantic image segmentation as a similarity measure, explaining in detail how it can be used to find similar patches and assessing its feasibility. We note that LRMA and LLRMA are mainly applied to unstructured medical data, and we propose extending their application to other medical data types, including structured and semi-structured data. This paper also discusses how LRMA and LLRMA can be applied to regular data with missing entries, and how inaccuracies in predicting the missing values affect the results. We discuss the impact of patch size and propose the use of random search (RS) to determine the optimal patch size. To enhance feasibility, a hybrid approach using Bayesian optimization and RS is proposed, which could improve the application of LRMA and LLRMA in medical imaging.

Paper Structure (46 sections, 33 equations, 13 figures, 16 tables, 4 algorithms)

Figures (13)

  • Figure 1: PRISMA flow diagram depicting the study selection process for the systematic literature review.
  • Figure 2: The Euclidean distance is a commonly used measure of similarity between two points in a Euclidean space. It is defined as the square root of the sum of the squared differences between the corresponding coordinates of the two points, as shown in this illustration, where $\mathbf{a}$ and $\mathbf{b}$ are two vectors in $\mathbb{R}^2$. These can be, for example, representations of two MRI scans. The notion generalizes to $\mathbb{R}^n$.
  • Figure 3: The dot product of two vectors, $\mathbf{a}$ and $\mathbf{b}$, is defined as the sum of the products of their corresponding components. As shown in the figure, it is related to the Euclidean distance.
  • Figure 4: (Left) A 2D feature visualization of a dataset that comprises three classes ($C_1, C_2, C_3$). The colors indicate the class labels. Given a new datapoint $X_\mathrm{new}$, the objective of the K-NN algorithm is to find the class $C_i$ to which the datapoint belongs, based on the labels of its $K$ nearest neighbors. (Right) After computing the distance between the query $X_\mathrm{new}$ and all the samples in the dataset, the query is classified into class $C_3$, which accounts for the majority of its neighbors.
  • Figure 5: A visual illustration of $K$-Means clustering on a 1D dataset with $n=14$ samples and $K=3$ cluster centroids. In this formulation, the initial centroids are chosen uniformly at random, $\bm{\mu} \sim \mathcal{U}(\mathcal{D}, K)$, as indicated in step 1. The distance $d_E(x_i, \bm{\mu}_k)$ from each sample $x_i$ to each centroid is computed, and the sample is assigned to the cluster with the minimum distance; in the figure, sample $x_1$ is assigned to cluster $1$. The new cluster centroids are then computed from the updated cluster assignment, such that $\bm{\mu}_k$ is the mean of all samples that belong to cluster $k$. These steps are repeated until convergence. Finally, all samples have been clustered, as indicated by step n in the figure, where the colors indicate the cluster to which each data point belongs.
  • ...and 8 more figures
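
The similarity measures and algorithms described in the captions of Figures 2 to 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`euclidean`, `knn_predict`, `kmeans_1d`), the default parameters, and the use of NumPy are all choices made here for illustration. The link between the dot product and the Euclidean distance mentioned for Figure 3 is the standard identity $\|\mathbf{a}-\mathbf{b}\|^2 = \mathbf{a}\cdot\mathbf{a} + \mathbf{b}\cdot\mathbf{b} - 2\,\mathbf{a}\cdot\mathbf{b}$.

```python
from collections import Counter

import numpy as np


def euclidean(a, b):
    """Square root of the sum of squared coordinate differences (Figure 2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def knn_predict(X, y, x_new, k=3):
    """Majority vote over the labels of the k nearest samples (Figure 4)."""
    dists = [euclidean(x, x_new) for x in X]
    nearest = np.argsort(dists)[:k]          # indices of the k closest samples
    votes = Counter(y[i] for i in nearest)   # count labels among the neighbors
    return votes.most_common(1)[0][0]


def kmeans_1d(x, k=3, n_iter=100, seed=0):
    """K-Means on 1D data with uniformly random initial centroids (Figure 5).

    Assumes n_iter >= 1 and at least k distinct samples in x.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: draw k distinct samples from the data as initial centroids.
    centroids = rng.choice(x, size=k, replace=False)
    for _ in range(n_iter):
        # Assign each sample to the centroid at minimum distance.
        labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
        # Recompute each centroid as the mean of its assigned samples;
        # an empty cluster keeps its previous centroid.
        updated = np.array([
            x[labels == j].mean() if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):  # converged: centroids stopped moving
            break
        centroids = updated
    return centroids, labels
```

As with any K-Means variant that starts from random centroids, the result depends on the initial draw and may be a local optimum, so running it with several seeds and comparing the results is a reasonable precaution.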