Preliminary Guidelines For Combining Data Integration and Visual Data Analysis

Adam Coscia, Ashley Suh, Remco Chang, Alex Endert

TL;DR

We conducted a preliminary user study to investigate whether and how data integration should be incorporated directly into the visual analytics process, and from its results synthesized preliminary guidelines for designing future visual analytics interfaces that can support integrating attributes throughout an active analysis process.

Abstract

Data integration is often performed to consolidate information from multiple disparate data sources during visual data analysis. However, integration operations are usually separate from visual analytics operations such as encode and filter in both interface design and empirical research. We conducted a preliminary user study to investigate whether and how data integration should be incorporated directly into the visual analytics process. We used two interface alternatives featuring contrasting approaches to the data preparation and analysis workflow: manual file-based ex-situ integration as a separate step from visual analytics operations; and automatic UI-based in-situ integration merged with visual analytics operations. Participants were asked to complete specific and free-form tasks with each interface, browsing for patterns, generating insights, and summarizing relationships between attributes distributed across multiple files. Analyzing participants' interactions and feedback, we found both task completion time and total interactions to be similar across interfaces and tasks, as well as unique integration strategies between interfaces and emergent behaviors related to satisficing and cognitive bias. Participants' time spent and interactions revealed that in-situ integration enabled users to spend more time on analysis tasks compared with ex-situ integration. Participants' integration strategies and analytical behaviors revealed differences in interface usage for generating and tracking hypotheses and insights. With these results, we synthesized preliminary guidelines for designing future visual analytics interfaces that can support integrating attributes throughout an active analysis process.

Paper Structure (21 sections, 5 figures)

Figures (5)

  • Figure 1: Data sets and tasks used in the study (Sect. \ref{sec:data_sets_and_tasks}). All data sets list each attribute's data type, percent complete (in italics), and total number of missing records (in parentheses), grouped under the files they are in.
  • Figure 2: Our two interfaces: (1) Separated (top; showing task CQ1); and (2) Combined (bottom; showing task LQ1) (Sect. \ref{sec:experimental_systems}).
  • Figure 3: Bootstrapped 95% CIs around the mean estimations of total task completion time (A) and percent time spent integrating between interfaces (B), as well as the time spent integrating organized by interface and task (C) (Sect. \ref{sec:time_spent}). Each estimate represents eight participants with 1000 resamples. The darker pink bars represent intervals of data integration and the vertical black lines are a proxy for showing when analysis started. The percent (%) of time spent integrating and number (#) of intervals of integration are shown to the right of each bar.
  • Figure 4: Bootstrapped 95% CIs around the mean estimations of (A) unique attributes added to the Attributes panel as well as total attribute interactions between interfaces in the (B) Encode and (C) Filter panels (Sect. \ref{sec:interactions}). Each estimate represents eight participants with 1000 resamples.
  • Figure 5: Total counts of primary and secondary attribute interactions in the Encode and Filter panels of the Combined interface (Sect. \ref{sec:interactions}). See Sect. \ref{sec:combined_interface} and Fig. \ref{fig:experimental_systems} for definitions.
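
The captions for Figures 3 and 4 report bootstrapped 95% CIs around mean estimates, each computed from eight participants with 1000 resamples. The sketch below shows one common way such an interval can be computed. It assumes the percentile bootstrap, which the source does not specify. The function name `bootstrap_mean_ci`, the fixed seed, and the sample values are hypothetical and are not data from the study.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval around the sample mean.

    Assumption: the percentile method; the paper may use a different variant.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo = means[int(alpha * n_resamples)]
    hi = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return statistics.fmean(values), lo, hi


# Hypothetical completion times in minutes for eight participants (not study data).
times = [12.5, 9.8, 15.2, 11.0, 13.7, 10.4, 14.1, 12.0]
mean, lo, hi = bootstrap_mean_ci(times)
print(f"mean={mean:.2f}, 95% CI=[{lo:.2f}, {hi:.2f}]")
```

With only eight observations per estimate, intervals of this kind tend to be wide, which is consistent with the study describing its results as preliminary.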