HiRegEx: Interactive Visual Query and Exploration of Multivariate Hierarchical Data

Guozheng Li, Haotian Mi, Chi Harold Liu, Takayuki Itoh, Guoren Wang

TL;DR

HiRegEx introduces a declarative grammar for querying multivariate hierarchical data. It addresses the difficulty of formulating queries during exploratory visual analysis (EVA) by supporting node, path, and subtree queries over both features and positions. Grounded in the e-MLTT task framework, it combines regex-inspired operators with hierarchical extensions and an Element Composition mechanism to support diverse tasks. The TreeQueryER prototype integrates a top-down visual editor, bottom-up recommendations, and a context-creation overview; it is validated by a case study on citation trees and an expressiveness assessment across 213 e-MLTT tasks. Together, these contributions support interactive EVA of large hierarchical datasets and offer a practical workflow for constructing and refining targeted queries.

Abstract

When using exploratory visual analysis to examine multivariate hierarchical data, users often need to query the data to narrow the scope of analysis. However, formulating effective query expressions remains a challenge for multivariate hierarchical data, particularly when datasets become very large. To address this issue, we develop a declarative grammar, HiRegEx (Hierarchical data Regular Expression), for querying and exploring multivariate hierarchical data. Rooted in the extended multi-level task typology framework for tree visualizations (e-MLTT), HiRegEx delineates three query targets (node, path, and subtree) and two aspects for querying these targets (features and positions), and uses operators based on classical regular expressions for query construction. Based on the HiRegEx grammar, we develop an exploratory framework for querying and exploring multivariate hierarchical data and integrate it into the TreeQueryER prototype system. The exploratory framework includes three major components: top-down pattern specification, bottom-up data-driven inquiry, and context-creation data overview. We validate the expressiveness of HiRegEx with the tasks from the e-MLTT framework and showcase the utility and effectiveness of the TreeQueryER system through a case study involving expert users in the analysis of a citation tree dataset.
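
To make the three query targets (node, path, subtree) and the two query aspects (features, positions) concrete, the following is a minimal Python sketch. It does not reproduce HiRegEx syntax or the TreeQueryER implementation: the `Node` class, the helper functions `query_nodes`, `query_paths`, and `query_subtrees`, and the toy citation tree with its `year` and `citations` attributes are all hypothetical names and data introduced only for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node carrying multivariate attributes (hypothetical structure)."""
    name: str
    attrs: dict
    children: list[Node] = field(default_factory=list)


def walk(node, depth=0):
    """Yield (node, depth) pairs in pre-order."""
    yield node, depth
    for child in node.children:
        yield from walk(child, depth + 1)


def query_nodes(root, feature, position=lambda depth: True):
    """Node target: nodes whose attributes satisfy `feature` (feature aspect)
    at a depth that satisfies `position` (position aspect)."""
    return [n for n, d in walk(root) if feature(n) and position(d)]


def query_paths(root, steps):
    """Path target: parent-to-child chains whose i-th node satisfies steps[i]."""
    results = []

    def extend(node, i, prefix):
        if not steps[i](node):
            return
        prefix = prefix + [node]
        if i == len(steps) - 1:
            results.append(prefix)
            return
        for child in node.children:
            extend(child, i + 1, prefix)

    # A matching chain may start at any node in the tree.
    for node, _ in walk(root):
        extend(node, 0, [])
    return results


def query_subtrees(root, root_feature, min_children):
    """Subtree target: subtrees whose root satisfies `root_feature`
    and has at least `min_children` direct children."""
    return [n for n, _ in walk(root)
            if root_feature(n) and len(n.children) >= min_children]


# Toy citation tree (invented data, not from the paper's case study).
tree = Node("P0", {"year": 2010, "citations": 120}, [
    Node("P1", {"year": 2014, "citations": 45}, [
        Node("P3", {"year": 2018, "citations": 60}),
        Node("P4", {"year": 2019, "citations": 3}),
    ]),
    Node("P2", {"year": 2015, "citations": 8}, [
        Node("P5", {"year": 2020, "citations": 55}),
    ]),
])

# Highly cited papers below the root (feature + position).
cited = [n.name for n in query_nodes(
    tree, lambda n: n.attrs["citations"] >= 50, lambda d: d >= 1)]

# Two-step chains where both papers have at least 40 citations.
well_cited = lambda n: n.attrs["citations"] >= 40
chains = [[n.name for n in p] for p in query_paths(tree, [well_cited, well_cited])]

# Subtrees rooted at a pre-2015 paper with at least two direct children.
hubs = [n.name for n in query_subtrees(tree, lambda n: n.attrs["year"] < 2015, 2)]
```

On this toy tree, `cited` is `["P3", "P5"]`, `chains` is `[["P0", "P1"], ["P1", "P3"]]`, and `hubs` is `["P0", "P1"]`. A declarative grammar such as HiRegEx would let users state these patterns as expressions rather than writing traversal code by hand.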

Paper Structure

This paper contains 19 sections, 6 equations, 8 figures.

Figures (8)

  • Figure 1: The formal specification of the HiRegEx declarative grammar. The first row introduces the overall structure of the HiRegEx specification. Rows 2 to 5 below present various query targets on the left, grounded in the e-MLTT framework, while the right side details element compositions that enable users to specify how these query targets are structured.
  • Figure 2: An explanation of the Branch operator in HiRegEx. Nodes in blue indicate the part matched by the HiRegEx expression, while nodes in red indicate the unmatched part.
  • Figure 3: The exploratory framework for querying multivariate hierarchical data comprises three modes: top-down, bottom-up, and context-creation. The top-down mode starts from a clear query task: users interactively construct the corresponding query expression through direct manipulation. The bottom-up mode recommends related query expressions based on the initial expression and the multivariate hierarchical data collection. The context-creation mode offers users an overview of the entire hierarchical data collection. Modules associated with the top-down, bottom-up, and context-creation modes are denoted by red, orange, and blue triangles, respectively.
  • Figure 4: The delete and insert operations for computing tree edit distance.
  • Figure 5: Three visual operators in the query expression: (a) Node, (b) Path, and (c) Branch, which are the basic components of query expressions.
  • ...and 3 more figures