Density-based Isometric Mapping

Bardia Yousefi, Mélina Khansari, Ryan Trask, Patrick Tallon, Carina Carino, Arman Afrasiyabi, Vikas Kundra, Lan Ma, Lei Ren, Keyvan Farahani, Michelle Hershman

TL;DR

The paper addresses overestimation of geodesic distances in Isomap when handling nonuniform high-dimensional manifolds by introducing PR-Isomap, a density-aware variant that adds a Parzen–Rosenblatt window constraint to the k-NN graph. This yields LD embeddings that preserve both local and global distances while enforcing data uniformity, demonstrated across 72,236 cases from MNIST, chest X-ray pneumonia, and NSCLC CT/PET radiogenomics datasets. PR-Isomap consistently outperforms baselines (t-SNE, PCA, Isomap, PHATE) in classification accuracy and improves survival prediction in multivariate Cox models, with notable gains in pneumonia and NSCLC cohorts and clear Kaplan–Meier separation. The method shows promise for HD multimodal medical data analysis and precision medicine, offering a scalable approach to extract robust imaging biomarkers for diagnosis and prognosis.

Abstract

The isometric mapping (Isomap) method uses a shortest-path algorithm to estimate the geodesic distance between points on high-dimensional (HD) manifolds. This may be insufficient for weakly uniform HD data, because it can overestimate distances between distant neighboring points, producing inconsistencies between the intrinsic (local) and extrinsic (global) distances during projection. To address this issue, we modify the shortest-path algorithm by adding a novel constraint inspired by the Parzen-Rosenblatt (PR) window, which helps maintain the uniformity of the shortest-path graph constructed in Isomap. Multiple imaging datasets totaling 72,236 cases were used to benchmark and validate PR-Isomap: 70,000 MNIST samples, 1,596 cases from multiple chest X-ray pneumonia datasets, and three NSCLC CT/PET datasets with a total of 640 lung cancer patients. From each modality, 431 imaging biomarkers were extracted. Our results indicate that PR-Isomap projects HD attributes into a lower-dimensional (LD) space while preserving information, as visualized on the MNIST dataset, where local and global distances are maintained. PR-Isomap achieved the highest comparative accuracies of 80.9% (STD: 5.8) for pneumonia and 78.5% (STD: 4.4), 88.4% (STD: 1.4), and 61.4% (STD: 11.4) for the three NSCLC datasets, with a 95% confidence interval for outcome prediction. Similarly, multivariate Cox models built on PR-Isomap features predicted overall survival better, as measured by c-statistics and the log-likelihood test, than models built on other dimensionality reduction methods. Kaplan–Meier survival curves also indicate a notable ability of PR-Isomap to distinguish between high-risk and low-risk patients using multimodal imaging biomarkers that preserve HD imaging characteristics for precision medicine.
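
To make the idea of a window-constrained shortest-path graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a rectangular Parzen-Rosenblatt kernel and a simple rule that keeps a k-NN edge only when the neighbor falls inside a window of width `h` around the point. The paper's exact constraint is not given in this summary, so the rule, the function names, and the parameters `k` and `h` are all illustrative assumptions. The final embedding step of Isomap (classical MDS on the geodesic distances) is omitted.

```python
import heapq
import math


def parzen_window(u):
    """Rectangular Parzen-Rosenblatt kernel: 1 inside the unit hypercube, else 0."""
    return 1.0 if all(abs(c) <= 0.5 for c in u) else 0.0


def constrained_knn_graph(points, k, h):
    """Build an undirected k-NN graph, keeping only edges that pass the window test.

    Assumption (hypothetical rule, not taken from the paper): an edge i-j is kept
    only if neighbor j lies inside the window of width h centered on point i.
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance and take the k closest.
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            u = [(a - b) / h for a, b in zip(points[j], p)]
            if parzen_window(u) > 0.0:
                graph[i][j] = d
                graph[j][i] = d  # keep the graph symmetric
    return graph


def geodesic_distances(graph):
    """All-pairs shortest-path lengths, running Dijkstra from every node."""
    n = len(graph)
    dist = [[math.inf] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[src][u]:
                continue  # stale heap entry
            for v, w in graph[u].items():
                nd = d + w
                if nd < dist[src][v]:
                    dist[src][v] = nd
                    heapq.heappush(heap, (nd, v))
    return dist


# Three evenly spaced points plus one far-away point in a sparse region.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
loose = geodesic_distances(constrained_knn_graph(points, k=2, h=100.0))
tight = geodesic_distances(constrained_knn_graph(points, k=2, h=3.0))
```

With a very wide window (`h=100.0`) the graph reduces to a plain k-NN graph, and the isolated point is linked through long edges. With a narrow window (`h=3.0`) those long edges are rejected and the point stays disconnected (infinite distance). A practical method has to handle such disconnected components in some way, which this sketch does not attempt.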

Paper Structure (14 sections, 10 equations, 7 figures, 4 tables, 1 algorithm)

Figures (7)

  • Figure 1: Our PR-Isomap method deflates a high-dimensional manifold onto a low-dimensional representation while preserving the similarities between points. We generate the HD manifold from the NSCLC Radiogenomics dataset [r29, r30, r31]. We apply standard t-SNE [r11], Isomap [r3], PCA [r1], and our proposed PR-Isomap method to reduce the dimensionality of the data. PR-Isomap preserves similarity in the low-dimensional embedding and shows more promising results in predicting patients’ survival.
  • Figure 2: Workflow of the proposed approach. A constrained isometric mapping method is used to reduce feature dimensionality. Then, a subset of radiomic signatures is used to predict patient survival with a multivariate Cox proportional hazards model. Afterwards, the parameters of the system are frozen, and the same analyses are performed on an independent NSCLC dataset.
  • Figure 3: Geometric interpretation of the shortest path and PR-Isomap under the Parzen–Rosenblatt window constraint $\mathcal{N}_{k,h}\left(\gamma\left(\mathbf{x}_k\right)\right)$ on $k$-nearest-neighbor windows, (a) on the HD manifold and (b) in the LD space after PR-Isomap projection. $d_{Euc}(\mathbf{x}_a,\mathbf{x}_b)$ and $D_{g,\gamma}\left(\mathbf{x}_a,\mathbf{x}_b\right)$ are the Euclidean and geodesic distances between $\mathbf{x}_a$ and $\mathbf{x}_b$, respectively.
  • Figure 4: Tumors cluster in the two phenotypes. Visualizations of the original CT/PET images with tumors in the field of view for each dataset, together with the chest X-ray pneumonia data. Tumor morphology and shape, along with other features, affect the quantitative attributes obtained from the selected contours.
  • Figure 5: Visualization of data dimensionality, tested on CT radiomics of the NSCLC Radiogenomics dataset. a) HD radiomics, $\mathbb{R}^{130 \times 431}$, presented using correlative distance. b) The HD radiomics projected into LD space using PR-Isomap. c) The HD radiomics projected into LD space using the original Isomap method. d) The HD features projected onto LD space using PCA. e) t-SNE heatmap of the same LD imaging biomarkers.
  • ...and 2 more figures
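
The workflow in Figure 2 ends with a Cox model whose quality the abstract reports as a c-statistic. As a reminder of what that metric measures, here is a minimal sketch of Harrell's concordance index. It is a generic textbook formulation, not code from the paper, and the function name and toy inputs are illustrative assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score.

    times  -- observed follow-up time per subject
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk score per subject (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third subject is censored, and one pair is ranked wrongly.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a higher c-statistic for one dimensionality reduction method means its features order patients by risk more consistently with the observed outcomes.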