Can VLMs Assess Similarity Between Graph Visualizations?

Seokweon Jung, Hyeon Jeon, Jeongmin Rhee, Jinwook Seo

TL;DR

This work investigates whether Vision Language Models (VLMs) can approximate human perceptual similarity judgments for graph visualizations, and whether those judgments align with established graph similarity measures. The authors prompt gpt-4o to produce a similarity score in [0,1] for each pair of graph images, then use Pearson correlation to compare that score against six feature-based measures computed from graph attributes such as $|V|$ and $|E|$ and from distributions such as node degree, clustering, betweenness, and community structure. Across varying graph sizes and densities, they observe strong correlations ($r \ge 0.8$) between VLM judgments and feature-based measures, with the strength modulated by size and density; high-level features such as community distribution $Cm$ and betweenness centrality $Bc$ align more strongly in larger, denser graphs. The results suggest that VLMs can serve as an approximate perceptual baseline for graph similarity. They also highlight differences among the measures and motivate future human-perception studies to refine the mapping between visual perception and traditional graph similarity metrics.

Abstract

Graph visualizations have been studied for tasks such as clustering and temporal analysis, but how the visual similarity between such visualizations relates to established graph similarity measures remains unclear. In this paper, we explore the potential of Vision Language Models (VLMs) to approximate human-like perception of graph similarity. We generate graph datasets of various sizes and densities and compare VLM-derived visual similarity scores with feature-based measures. Our findings indicate that VLMs can assess graph similarity in a manner similar to feature-based measures, even though differences among the measures exist. In future work, we plan to extend our research by conducting experiments on human visual graph perception.
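The comparison described above, correlating a VLM-derived score with a feature-based score across graph pairs, can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the authors' implementation: `degree_similarity` is a hypothetical stand-in for a feature-based measure (one minus the total variation distance between degree histograms), the toy graphs are invented, and the `vlm_scores` values are placeholders, not results from the paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def degree_histogram(edges, num_nodes):
    """Normalized degree histogram of a simple undirected graph (edge list)."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # In a simple graph the maximum degree is num_nodes - 1.
    hist = [0.0] * num_nodes
    for d in degree:
        hist[d] += 1 / num_nodes
    return hist


def degree_similarity(graph_a, graph_b):
    """Hypothetical feature-based score in [0, 1] (assumption, not from the paper):
    1 minus the total variation distance between the degree histograms."""
    (edges_a, n_a), (edges_b, n_b) = graph_a, graph_b
    size = max(n_a, n_b)
    hist_a = degree_histogram(edges_a, n_a) + [0.0] * (size - n_a)
    hist_b = degree_histogram(edges_b, n_b) + [0.0] * (size - n_b)
    return 1.0 - 0.5 * sum(abs(p - q) for p, q in zip(hist_a, hist_b))


if __name__ == "__main__":
    # Toy graphs as (edge list, node count); invented for illustration.
    path = ([(0, 1), (1, 2), (2, 3)], 4)
    star = ([(0, 1), (0, 2), (0, 3)], 4)
    cycle = ([(0, 1), (1, 2), (2, 3), (3, 0)], 4)
    pairs = [(path, star), (path, cycle), (star, cycle)]

    feature_scores = [degree_similarity(a, b) for a, b in pairs]
    # Placeholder values standing in for VLM similarity scores in [0, 1].
    vlm_scores = [0.6, 0.5, 0.1]

    print("feature-based scores:", feature_scores)
    print("Pearson r:", pearson_r(feature_scores, vlm_scores))
```

In a real study, one correlation coefficient would be computed per feature-based measure and per size and density condition, which is what allows the strength of alignment to be compared across conditions.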

Paper Structure

This paper contains 8 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Pearson correlation coefficients between gpt-4o and the six similarity measures in Table 1.