Density-based Isometric Mapping
Bardia Yousefi, Mélina Khansari, Ryan Trask, Patrick Tallon, Carina Carino, Arman Afrasiyabi, Vikas Kundra, Lan Ma, Lei Ren, Keyvan Farahani, Michelle Hershman
TL;DR
The paper addresses overestimation of geodesic distances in Isomap when handling nonuniform high-dimensional (HD) manifolds by introducing PR-Isomap, a density-aware variant that adds a Parzen–Rosenblatt window constraint to the k-NN graph. This yields low-dimensional (LD) embeddings that preserve both local and global distances while enforcing data uniformity, demonstrated across 72,236 cases from MNIST, chest X-ray pneumonia, and NSCLC CT/PET radiogenomics datasets. PR-Isomap consistently outperforms baselines (t-SNE, PCA, Isomap, PHATE) in classification accuracy and improves survival prediction in multivariate Cox models, with notable gains in pneumonia and NSCLC cohorts and clear Kaplan–Meier separation. The method shows promise for HD multimodal medical data analysis and precision medicine, offering a scalable approach to extract robust imaging biomarkers for diagnosis and prognosis.
Abstract
Isometric mapping (Isomap) uses a shortest-path algorithm to estimate geodesic distances between points on high-dimensional (HD) manifolds. This may not be sufficient for weakly uniform HD data, as it can overestimate distances between far-apart neighboring points, producing inconsistencies between the intrinsic (local) and extrinsic (global) distances during projection. To address this issue, we modify the shortest-path algorithm by adding a novel constraint inspired by the Parzen-Rosenblatt (PR) window, which helps maintain the uniformity of the shortest-path graph constructed in Isomap. PR-Isomap was benchmarked and validated on multiple imaging datasets totaling 72,236 cases: 70,000 MNIST images, 1,596 cases from multiple chest X-ray pneumonia datasets, and three NSCLC CT/PET datasets with a total of 640 lung cancer patients. 431 imaging biomarkers were extracted from each modality. Our results indicate that PR-Isomap projects HD attributes into a lower-dimensional (LD) space while preserving information, as visualized on the MNIST dataset, where local and global distances are maintained. PR-Isomap achieved the highest comparative accuracies of 80.9% (STD: 5.8) for pneumonia and 78.5% (STD: 4.4), 88.4% (STD: 1.4), and 61.4% (STD: 11.4) for the three NSCLC datasets, with a 95% confidence interval for outcome prediction. Similarly, in multivariate Cox models, PR-Isomap predicted overall survival better than other dimensionality reduction methods, as measured by c-statistics and the log-likelihood test. Kaplan-Meier survival curves also show the notable ability of PR-Isomap to distinguish between high-risk and low-risk patients using multimodal imaging biomarkers that preserve HD imaging characteristics for precision medicine.
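
The pipeline described in the abstract (a k-NN graph restricted by a window constraint, shortest paths as geodesic estimates, then a low-dimensional embedding) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `pr_isomap_sketch`, the use of a hard uniform window of radius `bandwidth` as the Parzen-Rosenblatt-inspired constraint, and the default bandwidth heuristic are all assumptions made here for illustration, and the exact form of the constraint in the paper may differ.

```python
import numpy as np


def pr_isomap_sketch(X, n_neighbors=8, bandwidth=None, n_components=2):
    """Illustrative density-constrained Isomap (hypothetical, not the paper's code)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances in the input space.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # k-NN candidates; column 0 of the argsort is the point itself.
    order = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    knn_dist = np.take_along_axis(dist, order, axis=1)

    if bandwidth is None:
        # Assumed heuristic: twice the median distance to the k-th neighbor.
        bandwidth = 2.0 * np.median(knn_dist[:, -1])

    # Assumed window constraint: keep a k-NN edge only if the neighbor
    # falls inside a uniform window of radius `bandwidth`.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = order.ravel()
    d = knn_dist.ravel()
    keep = d <= bandwidth
    graph[rows[keep], cols[keep]] = d[keep]
    graph = np.minimum(graph, graph.T)  # make the graph undirected

    # Floyd-Warshall all-pairs shortest paths (O(n^3), small n only).
    geo = graph.copy()
    for k in range(n):
        geo = np.minimum(geo, geo[:, k:k + 1] + geo[k:k + 1, :])

    if not np.isfinite(geo).all():
        raise ValueError("graph is disconnected; increase n_neighbors or bandwidth")

    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - 1.0 / n  # centering matrix I - (1/n) 11^T
    B = -0.5 * J @ (geo ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    emb = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
    return emb, geo


# Usage on a toy manifold: 60 points sampled along a half circle.
t = np.linspace(0.0, np.pi, 60)
X = np.column_stack([np.cos(t), np.sin(t)])
emb, geo = pr_isomap_sketch(X, n_neighbors=4, n_components=2)
```

On this toy curve the estimated geodesic distance between the two endpoints lies between the straight-line distance and the true arc length, which is the behavior the shortest-path step is meant to provide. For real datasets a sparse graph with Dijkstra's algorithm would replace the dense Floyd-Warshall loop, and the bandwidth would need tuning to the data density.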
