Visual Analytics of Multivariate Networks with Representation Learning and Composite Variable Construction

Hsiao-Ying Lu, Takanori Fujiwara, Ming-Yi Chang, Yang-chih Fu, Anders Ynnerman, Kwan-Liu Ma

TL;DR

The paper tackles the challenge of uncovering associations in multivariate networks by proposing a visual analytics workflow that combines neural-network-based representation learning with targeted interpretability methods. It learns representations specific to a user-selected attribute through an NN trained on precomputed structural features, then compresses these into a 1D latent space via regularized LDA for interpretability. Attribute contributions are quantified with SHAP values, guiding interactive composite-variable construction that linearly combines multiple attributes to approximate the 1D representation, enabling intuitive explanations of complex relationships. A novel two-class density scatterplot visualizes class separation, density, and correlations, and three case studies on social-network datasets demonstrate the approach’s practical utility and expert-validated usability.

Abstract

Multivariate networks are commonly found in real-world data-driven applications. Uncovering and understanding the relations of interest in multivariate networks is not a trivial task. This paper presents a visual analytics workflow for studying multivariate networks to extract associations between different structural and semantic characteristics of the networks (e.g., what are the combinations of attributes largely relating to the density of a social network?). The workflow consists of a neural-network-based learning phase to classify the data based on the chosen input and output attributes, a dimensionality reduction and optimization phase to produce a simplified set of results for examination, and finally an interpreting phase conducted by the user through an interactive visualization interface. A key part of our design is a composite variable construction step that remodels nonlinear features obtained by neural networks into linear features that are intuitive to interpret. We demonstrate the capabilities of this workflow with multiple case studies on networks derived from social media usage and also evaluate the workflow with qualitative feedback from experts.
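The composite variable construction step can be illustrated with a minimal sketch. It assumes a composite variable is a weighted sum of standardized attributes and that its fit to the 1D network representation is judged by Pearson correlation. The function names, attribute names, weights, and data below are hypothetical and are not taken from the paper, which may use a different normalization or fitting criterion.

```python
from math import sqrt


def standardize(values):
    """Scale values to zero mean and unit variance (all zeros if constant)."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def composite_variable(attributes, weights):
    """Linearly combine standardized attributes using user-chosen weights.

    attributes: dict mapping attribute name -> list of per-network values
    weights:    dict mapping attribute name -> weight (hypothetical choice)
    """
    n = len(next(iter(attributes.values())))
    z = {name: standardize(attributes[name]) for name in weights}
    return [sum(weights[name] * z[name][i] for name in weights) for i in range(n)]


def pearson(xs, ys):
    """Pearson correlation between two equally long, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy data: two attributes for five networks and a stand-in
# for the 1D representation produced by the learning and reduction phases.
attributes = {
    "posts": [1, 2, 3, 4, 5],
    "friends": [10, 8, 6, 4, 2],
}
representation_1d = [0.1, 0.2, 0.3, 0.4, 0.5]

composite = composite_variable(attributes, {"posts": 1.0, "friends": -1.0})
fit = pearson(composite, representation_1d)
```

In an interactive setting, an analyst would adjust the weights (for example, guided by attribute contribution scores) and watch how the correlation between the composite variable and the 1D representation changes.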

Paper Structure (27 sections, 10 figures)

Figures (10)

  • Figure 1: The visual analytics workflow for investigating associations in multivariate networks, where Steps 1–4 are executed with a script for machine learning and Steps 5–6 are conducted interactively with our UI.
  • Figure 2: The visual interface for facilitating interactive analysis using the workflow in Figure 1. (a) Relating to Step 4, this view visualizes input attributes' contributions to the 1D network representation. (b) Composite variables generated in Step 5 are shown as a list of scatterplots, where the 1D network representation and composite variable correspond to $x$- and $y$-axes, respectively (see b2). As shown in b1, when a composite variable is not generated yet, a swarm-plot-like visualization presents the 1D network representation generated through Step 3 to help assess its quality. For a and b, we employ our two-class density scatterplots. (c) A node-link diagram and (d) a set of histograms inform the network structure and attribute distributions. (e) Other auxiliary information is displayed, including the prediction accuracy of NNs trained for Step 2.
  • Figure 3: The comparison of scatterplot designs: (a) scatterplot with colored classes, (b) density scatterplot, (c,d) scatterplots encoding the total density and the ratio of each class's density with two different bivariate colormaps. (d) is our final design for a two-class density scatterplot.
  • Figure 4: 1D representations generated by alternative designs. The same dataset and visualization as Figure 2-b1 are used (red: Class 0, blue: Class 1). (a) is made by applying a softmax activation function at the last layer of an MLP. (b) is generated by directly applying LDA to the dataset.
  • Figure 5: 1D representations generated by training the same NRL architecture as in Figure 2-b1. Unlike Figure 2-b1, the NRL architecture was trained with a subset of the data (616 instances). (a) and (b) correspond to the 1D representations of the training set and the remaining test set (170 instances).
  • ...and 5 more figures
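
The two-class density scatterplot described in the Figure 3 caption encodes, per region, the total density and the ratio of each class's density. A minimal sketch of those two quantities follows, assuming simple grid binning over a known plot range. The function name, the binning scheme, and the toy points are hypothetical; the density estimation and the bivariate colormap used by the authors are not specified here and are not reproduced.

```python
def two_class_density_grid(points, labels, bins, x_range, y_range):
    """Return a bins x bins grid of (total_count, class1_ratio) per cell.

    points:  list of (x, y) tuples
    labels:  list of 0/1 class labels, one per point
    class1_ratio is None for empty cells, where no ratio is defined.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    counts = [[[0, 0] for _ in range(bins)] for _ in range(bins)]

    for (x, y), label in zip(points, labels):
        # Map each coordinate to a cell index, clamped to the grid bounds.
        col = int((x - x_min) / (x_max - x_min) * bins)
        row = int((y - y_min) / (y_max - y_min) * bins)
        col = max(0, min(col, bins - 1))
        row = max(0, min(row, bins - 1))
        counts[row][col][label] += 1

    grid = []
    for row_counts in counts:
        out_row = []
        for class0, class1 in row_counts:
            total = class0 + class1
            ratio = class1 / total if total else None
            out_row.append((total, ratio))
        grid.append(out_row)
    return grid


# Hypothetical toy data: four points in the unit square, two per class.
points = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (0.15, 0.1)]
labels = [0, 0, 1, 1]
grid = two_class_density_grid(points, labels, bins=2,
                              x_range=(0.0, 1.0), y_range=(0.0, 1.0))
```

A renderer could then map each cell's total count to one colormap dimension (for example, lightness) and the class ratio to the other (for example, hue), which is the general idea of a bivariate colormap.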