Scalable Readability Evaluation for Graph Layouts: 2D Geometric Distributed Algorithms

Sanggeon Yun

TL;DR

Experimental results show that these distributed algorithms significantly reduce computation time, achieving up to a 17x speedup for node occlusion and a 146x improvement for edge crossing on large datasets.

Abstract

Graphs, consisting of vertices and edges, are vital for representing complex relationships in fields such as social networks, finance, and blockchain. Visualizing these graphs helps analysts identify structural patterns, and readability metrics such as node occlusion and edge crossing assess how clear a layout is. However, calculating these metrics is computationally intensive, which makes scalability a challenge for large graphs. Without efficient readability metrics, layout generation processes face a bottleneck despite the numerous studies focused on accelerating them, making it hard to select or produce optimized layouts quickly. Previous approaches attempted to accelerate readability evaluation with machine learning models that predict readability scores from rendered images of graphs. While these models offered some improvement, they struggled with scalability and accuracy, especially for graphs with thousands of nodes. For instance, because this approach relies on rendered images of the graph, it requires substantial memory to process large images; graphs with more than 600 nodes cannot be fed into the model, and errors can exceed 55% in some readability metrics due to difficulties in generalizing across diverse graph layouts. This study addresses these limitations by introducing scalable algorithms for readability evaluation in distributed environments, using Spark's DataFrame and GraphFrame frameworks to manage large data volumes efficiently across multiple machines. Experimental results show that these distributed algorithms significantly reduce computation time, achieving up to a 17x speedup for node occlusion and a 146x improvement for edge crossing on large datasets. These enhancements make scalable graph readability evaluation practical and efficient, overcoming the limitations of previous machine-learning approaches.
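To make the cost of these metrics concrete, the following is a minimal single-machine sketch of brute-force node occlusion and edge crossing counts. It is not the paper's distributed algorithm: the function names, the fixed node radius, and the choice to skip edge pairs that share an endpoint are assumptions made for illustration, and the paper's exact metric definitions (for example, any normalization) may differ. Both counts examine every pair of nodes or edges, which is the quadratic cost that motivates a scalable approach.

```python
from itertools import combinations
from math import hypot


def count_node_occlusions(positions, radius):
    """Count unordered node pairs whose circles of the given radius overlap."""
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(positions, 2)
        if hypot(x1 - x2, y1 - y2) < 2 * radius
    )


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1, 0, or -1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def segments_cross(a, b, c, d):
    """True if segments ab and cd properly cross (touching endpoints do not count)."""
    return (
        _orient(a, b, c) * _orient(a, b, d) < 0
        and _orient(c, d, a) * _orient(c, d, b) < 0
    )


def count_edge_crossings(positions, edges):
    """Count crossing edge pairs, skipping pairs that share an endpoint (assumption)."""
    count = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue  # adjacent edges meet at a shared vertex, not a crossing
        if segments_cross(positions[u1], positions[v1],
                          positions[u2], positions[v2]):
            count += 1
    return count


# Hypothetical toy layout: two diagonals of a square plus one side edge.
positions = [(0, 0), (2, 2), (0, 2), (2, 0), (0.1, 0)]
edges = [(0, 1), (2, 3), (0, 2)]
occlusions = count_node_occlusions(positions, radius=0.1)
crossings = count_edge_crossings(positions, edges)
```

In this toy layout, nodes 0 and 4 overlap and the two diagonals cross once. With n nodes and m edges the loops visit n(n-1)/2 and m(m-1)/2 pairs respectively, so the work grows quadratically with graph size.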

Paper Structure

This paper contains 21 sections, 1 equation, 4 figures, 4 tables, 5 algorithms.

Figures (4)

  • Figure 1: Overview of enhanced readability evaluation algorithms. (A) Node occlusion overview. (B) Edge crossing overview. (C) Edge crossing angle overview. The circled numbers indicate the steps of each algorithm. Note that the first two steps of the edge crossing angle algorithm are omitted since they are the same as the first two steps of edge crossing.
  • Figure 2: Running time ratio by the number of vertices. Only readability evaluation algorithms whose running time is influenced by the number of vertices are shown. The dotted lines indicate fitted power functions. The grey line marks a $1\times$ ratio, where the running time equals that of Greadability.js.
  • Figure 3: Running time ratio by the number of edges. Only readability evaluation algorithms whose running time is influenced by the number of edges are shown. The dotted lines indicate fitted power functions. The grey line marks a $1\times$ ratio, where the running time equals that of Greadability.js.
  • Figure 4: Strong scalability of proposed readability evaluation algorithms on the musae-facebook dataset. The dotted lines on (a) indicate fitted exponential functions.