Using Counterfactuals to Improve Causal Inferences from Visualizations

David Borland; Arran Zeyu Wang; David Gotz

TL;DR

Visual data explorations often lead users to infer causality from correlations, risking invalid conclusions. The paper advocates counterfactual reasoning as a scalable approach to support visual causal inference and introduces the CoFact framework, which defines included, excluded, and counterfactual subsets to compare outcomes. It discusses limitations of traditional graph-based causal models (DAGs/DCGs) in large, complex data and outlines open challenges and future directions, including cognitive modeling, communication of counterfactuals, and measures for subset quality. The work emphasizes practical implications for designing visualization tools that help users make more credible causal judgments in data-rich settings.

Abstract

Traditional approaches to data visualization have often focused on comparing different subsets of data, and this is reflected in the many techniques developed and evaluated over the years for visual comparison. Similarly, common workflows for exploratory visualization are built upon the idea of users interactively applying various filter and grouping mechanisms in search of new insights. This paradigm has proven effective at helping users identify correlations between variables that can inform thinking and decision-making. However, recent studies show that consumers of visualizations often draw causal conclusions even when these are not supported by the data. Motivated by these observations, this article highlights recent advances from a growing community of researchers exploring methods that aim to directly support visual causal inference. Many of these approaches, though, have limitations of their own that restrict their use in real-world scenarios. This article therefore also outlines a set of key open challenges and corresponding priorities for new research to advance the state of the art in visual causal inference.

Paper Structure (17 sections, 3 figures)

This paper contains 17 sections, 3 figures.

PERCEIVING CAUSALITY
VISUAL CAUSAL INFERENCE
Data Quality
Data Complexity
Direction of Causal Relationships
Limited Inference Levels
COUNTERFACTUALS
What is a Counterfactual
Visualizing Counterfactuals
Advantages and Limitations
FUTURE OPPORTUNITIES FOR VISUAL CAUSAL INFERENCE
Better Cognitive Models of Causal Inference with Visualization
Improving Communication of Counterfactual Visual Representations
Advances in Measures for Evaluating the Quality of Counterfactual Subsets
Improving Guided Exploration
...and 2 more sections

Figures (3)

Figure 1: Example causal graphs: (a) a simple causal graph in which social media use decreases happiness, (b) a graph with a confounder, in which the apparent causal relationship between social media use and happiness is in fact caused by a third factor, number of coworker friends, that has a causal effect on both, and (c) a graph with a collider, in which the apparent causal relationship between social media use and happiness is due to both having a causal effect on this third factor. In this case, the collider also exhibits a cycle.
Figure 2: The CoFact visualization system (Kaul et al., 2021) leverages counterfactuals to help better communicate the relationship between variables of interest. In this figure's example, the user has applied a filter constraint on square footage (a) to a multidimensional house sales dataset. In response, they are shown the resulting included, counterfactual, and excluded subsets (b), along with their corresponding distributions for a selected outcome feature of interest: house sale price (c). Additional feature-to-outcome relationships can be explored with supplementary visualizations (d–i). The tool supports comparisons between an included subset (data points that match user-specified inclusion criteria) and a counterfactual subset containing similar data points selected from those data points that do not meet the inclusion criteria.
Figure 3: A counterfactual subset contains data points from the excluded set that are the most similar to those in the included set. Prior work (Kaul et al., 2021) has shown that a visualization that allows users to compare the counterfactual subset against the included subset (c) supports more accurate causal inferences compared to a more traditional approach (b).
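
The subset definitions in the Figure 2 and Figure 3 captions can be illustrated with a minimal sketch. It is not the CoFact implementation: the toy house records, the `split_subsets` helper, the choice of Euclidean distance, and the rule of picking one nearest excluded point per included point are all assumptions made for illustration. The captions say only that the counterfactual subset holds the excluded points most similar to the included ones.

```python
from math import dist
from statistics import mean

# Hypothetical toy records: (square_feet, bedrooms, age_years, sale_price).
houses = [
    (2400, 4, 10, 520_000),
    (2200, 3, 25, 450_000),
    (2100, 4, 5, 500_000),
    (1800, 4, 8, 470_000),
    (1700, 3, 30, 380_000),
    (1500, 2, 40, 300_000),
    (1900, 3, 22, 420_000),
    (1200, 1, 50, 210_000),
]


def split_subsets(rows, predicate, covariates):
    """Return (included, excluded, counterfactual) subsets of rows.

    included: rows that satisfy the user's filter (predicate).
    excluded: all remaining rows.
    counterfactual: for each included row, the excluded row closest to it
    on the covariates (assumed similarity rule), without duplicates.
    """
    included = [r for r in rows if predicate(r)]
    excluded = [r for r in rows if not predicate(r)]
    chosen = set()
    for inc in included:
        nearest = min(
            range(len(excluded)),
            key=lambda i: dist(covariates(inc), covariates(excluded[i])),
        )
        chosen.add(nearest)
    counterfactual = [excluded[i] for i in sorted(chosen)]
    return included, excluded, counterfactual


# Filter on square footage; measure similarity on the other features
# (bedrooms, and age in decades so both are on a comparable scale).
included, excluded, counterfactual = split_subsets(
    houses,
    predicate=lambda r: r[0] >= 2000,
    covariates=lambda r: (r[1], r[2] / 10),
)

for name, subset in [
    ("included", included),
    ("excluded", excluded),
    ("counterfactual", counterfactual),
]:
    print(name, len(subset), mean([r[3] for r in subset]))
```

On this made-up data, the gap in mean sale price between the included and excluded subsets is 134,000, while the gap between the included and counterfactual subsets is 45,000. The numbers carry no empirical weight. They show only why comparing against similar excluded points can give a different picture than comparing against everything the filter removed.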
