DataMap: A Portable Application for Visualizing High-Dimensional Data
Xijin Ge
TL;DR
DataMap addresses the need for secure, scalable, and reproducible visualization of high-dimensional biomedical data by delivering a browser-based, serverless Shiny application powered by WebAssembly and WebR. It provides heatmaps, PCA, and t-SNE visualizations with automatic data import, preprocessing, and annotation support, and automatically generates reproducible R code for all analyses. The approach preserves data privacy by processing data entirely on the client side while delivering publication-grade graphics, though it runs more slowly than native implementations. Key contributions include a comprehensive client-side visualization workflow, automatic preprocessing heuristics, and end-to-end reproducible code generation, with practical impact for omics and other high-dimensional datasets. Future work aims to optimize performance, broaden visualization capabilities, and extend analytical modules while acknowledging WebR package constraints.
Abstract
Motivation: The visualization and analysis of high-dimensional data are essential in biomedical research. There is a need for secure, scalable, and reproducible tools to facilitate data exploration and interpretation. Results: We introduce DataMap, a browser-based application for visualizing high-dimensional data using heatmaps, principal component analysis (PCA), and t-distributed stochastic neighbor embedding (t-SNE). DataMap runs entirely in the web browser, which keeps data private and eliminates the need for installation or a server. The application has an intuitive user interface for data transformation, annotation, and generation of reproducible R code. Availability and Implementation: DataMap is freely available as a GitHub page at https://gexijin.github.io/datamap/. The source code is available at https://github.com/gexijin/datamap, and DataMap can also be installed as an R package. Contact: Xijin.Ge@sdstate.ed
