MapExplorer: New Content Generation from Low-Dimensional Visualizations
Xingjian Zhang, Ziyang Xiong, Shixuan Liu, Yutong Xie, Tolga Ergen, Dongsub Shim, Hua Xu, Honglak Lee, Qiaozhu Mei
TL;DR
MapExplorer defines a novel task that generates text conditioned on coordinates in a 2D projection map, enabling exploration of unknown regions in data spaces. It introduces Atometric, an entailment-based, atomic-statement metric with multiple strictness levels to assess generated content against references, addressing limitations of n-gram and embedding-based metrics. The study compares three design families—Direct Mapping, Intermediate High-Dimensional Embedding, and Two-Stage Text Generation—across five diverse visualization datasets, revealing strengths of retrieval-augmented prompts and embedding inversion in different contexts. The work demonstrates MapExplorer’s potential for scientific idea generation and LLM red-teaming, and provides an online demo, while outlining limitations and future directions for more capable and generalized methods.
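To make the idea of an entailment-based, atomic-statement metric concrete, here is a minimal sketch in the spirit of Atometric. It is not the authors' implementation: the function names (`toy_entail`, `supported_fraction`, `atomic_prf`), the word-overlap stand-in for an entailment model, and the use of a single `threshold` to mimic a strictness level are all assumptions made for illustration. It also assumes both texts have already been split into atomic statements.

```python
from typing import Callable, Dict, List

# Hypothetical judge signature: score in [0, 1] that `premise` entails `hypothesis`.
EntailFn = Callable[[str, str], float]


def toy_entail(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model: share of hypothesis words found in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)


def supported_fraction(
    statements: List[str], context: str, entail: EntailFn, threshold: float
) -> float:
    """Fraction of atomic statements whose entailment score reaches the threshold."""
    if not statements:
        return 0.0
    hits = sum(entail(context, s) >= threshold for s in statements)
    return hits / len(statements)


def atomic_prf(
    generated: List[str],
    reference: List[str],
    entail: EntailFn = toy_entail,
    threshold: float = 0.5,  # assumed knob: higher values act as a stricter level
) -> Dict[str, float]:
    """ROUGE-style precision, recall, and F1 over atomic statements."""
    # Precision: generated statements supported by the reference text.
    precision = supported_fraction(generated, " ".join(reference), entail, threshold)
    # Recall: reference statements supported by the generated text.
    recall = supported_fraction(reference, " ".join(generated), entail, threshold)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    gen = ["umap preserves local structure", "cats are green"]
    ref = ["umap preserves local structure in maps"]
    print(atomic_prf(gen, ref, threshold=0.5))
    print(atomic_prf(gen, ref, threshold=0.9))
```

In practice the toy judge would be replaced by a real entailment model; the surrounding precision and recall bookkeeping would stay the same.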
Abstract
Low-dimensional visualizations, or "projection maps," are widely used in scientific and creative domains to interpret large-scale and complex datasets. These visualizations not only aid in understanding existing knowledge spaces but also implicitly guide exploration into unknown areas. Although techniques such as t-SNE and UMAP can generate these maps, there exists no systematic method for leveraging them to generate new content. To address this, we introduce MapExplorer, a novel knowledge discovery task that translates coordinates within any projection map into coherent, contextually aligned textual content. This allows users to interactively explore and uncover insights embedded in the maps. To evaluate the performance of MapExplorer methods, we propose Atometric, a fine-grained metric inspired by ROUGE that quantifies logical coherence and alignment between generated and reference text. Experiments on diverse datasets demonstrate the versatility of MapExplorer in generating scientific hypotheses, crafting synthetic personas, and devising strategies for attacking large language models, even with simple baseline methods. By bridging visualization and generation, our work highlights the potential of MapExplorer to enable intuitive human-AI collaboration in large-scale data exploration.
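As a rough illustration of the task, the sketch below turns a query coordinate into a retrieval-augmented prompt by collecting the texts of its nearest neighbors on the map. This is one simple baseline under stated assumptions, not the method evaluated in the paper: the function names (`nearest_neighbors`, `build_prompt`), the Euclidean distance, the value of `k`, and the prompt wording are all hypothetical, and the call to a language model is left out.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest_neighbors(query: Point, coords: Sequence[Point], k: int = 3) -> List[int]:
    """Indices of the k map points closest to the query (Euclidean distance)."""
    order = sorted(range(len(coords)), key=lambda i: math.dist(query, coords[i]))
    return order[:k]


def build_prompt(
    query: Point, coords: Sequence[Point], texts: Sequence[str], k: int = 3
) -> str:
    """Assemble a prompt from the query's neighbors; wording is an assumption."""
    neighbors = nearest_neighbors(query, coords, k)
    lines = [
        f"- ({coords[i][0]:.2f}, {coords[i][1]:.2f}): {texts[i]}" for i in neighbors
    ]
    return (
        "Each item below is a text with its 2D map position.\n"
        + "\n".join(lines)
        + f"\nWrite a new text that would plausibly sit at "
        f"({query[0]:.2f}, {query[1]:.2f})."
    )


if __name__ == "__main__":
    coords = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.5)]
    texts = ["graph learning", "text mining", "protein folding", "network embedding"]
    print(build_prompt((0.4, 0.4), coords, texts, k=2))
```

The resulting string would be passed to a text generator of the user's choice; distances in a t-SNE or UMAP map are only locally meaningful, so a small `k` is the safer assumption here.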
