Chart2Vec: A Universal Embedding of Context-Aware Visualizations
Qing Chen, Ying Chen, Ruishi Zou, Wei Shuai, Yi Guo, Jiazhe Wang, Nan Cao
TL;DR
Chart2Vec introduces a context-aware universal embedding for visualizations, learned from declarative chart facts and their contextual co-occurrence in multi-view visualizations. It combines a structural representation based on a context-free grammar (CFG) with Word2Vec-based semantic encoding, and is trained with a multi-task loss that fuses linear interpolation and triplet learning over four-chart inputs. The approach is evaluated on a curated dataset of 849 data stories and 249 dashboards (6014 visualizations) and shows improvements over ChartSeer and Erato in retrieval and co-occurrence tasks, supported by a user study and ablation analyses. The work enables downstream tasks such as visualization recommendation and storytelling, and suggests generalizability to other formats and real-world BI tools.
Abstract
Advances in AI-enabled techniques have accelerated the creation and automation of visualizations over the past decade. However, presenting visualizations in a descriptive and generative format remains a challenge. Moreover, current visualization embedding methods focus on standalone visualizations and neglect the contextual information that matters for multi-view visualizations. To address these issues, we propose a new representation model, Chart2Vec, that learns a universal embedding of visualizations with context-aware information. Chart2Vec aims to support a wide range of downstream visualization tasks, such as recommendation and storytelling. Our model considers both the structural and the semantic information of visualizations in declarative specifications. To enhance its context-aware capability, Chart2Vec employs multi-task learning on both supervised and unsupervised tasks concerning the co-occurrence of visualizations. We evaluate our method through an ablation study, a user study, and a quantitative comparison. The results verify that our embedding method is consistent with human cognition and show its advantages over existing methods.
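
To make the multi-task idea concrete, the following is a minimal sketch of how an unsupervised linear-interpolation term and a supervised triplet term could be combined over a four-chart input. It is not the authors' formulation. The ordering of the four charts (three charts from one multi-view visualization plus one outsider chart), the Euclidean distance, the midpoint target, the margin, and the weight alpha are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def interpolation_loss(first, middle, last):
    """Unsupervised term (assumed form): the middle chart's embedding
    should lie near the midpoint of its two neighbours' embeddings."""
    midpoint = [(x + y) / 2 for x, y in zip(first, last)]
    return euclidean(middle, midpoint)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Supervised term: standard triplet margin loss. The co-occurring
    chart should be closer to the anchor than the outsider, by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def multi_task_loss(first, middle, last, outsider, alpha=0.5, margin=1.0):
    """Weighted sum of both terms; `alpha` is a hypothetical mixing weight."""
    unsupervised = interpolation_loss(first, middle, last)
    supervised = triplet_loss(first, middle, outsider, margin)
    return alpha * unsupervised + (1 - alpha) * supervised


# Toy 2-D embeddings: a, b, c stand for charts from the same multi-view
# visualization; n stands for a chart taken from a different one.
a, b, c = [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]
n = [5.0, 0.0]
loss = multi_task_loss(a, b, c, n)
```

In this toy setup b sits exactly at the midpoint of a and c, and n is far from a, so both terms vanish. Moving n closer to a than b is makes the triplet term positive, which is the signal that would push co-occurring charts together during training.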
