From Toil to Thought: Designing for Strategic Exploration and Responsible AI in Systematic Literature Reviews

Runlong Ye, Naaz Sibia, Angela Zavaleta Bernuy, Tingting Zhu, Carolina Nobre, Viktoria Pammer-Schindler, Michael Liut

TL;DR

ARC is a design probe that operationalizes solutions for multi-database integration, transparent iterative search, and verifiable AI-assisted screening. It supports verifiable judgment and aims to augment expert contributions from initial creation through long-term maintenance of knowledge synthesis.

Abstract

Systematic Literature Reviews (SLRs) are fundamental to scientific progress, yet the process is hindered by a fragmented tool ecosystem that imposes a high cognitive load. This friction suppresses the iterative, exploratory nature of scholarly work. To investigate these challenges, we conducted an exploratory design study with 20 experienced researchers. This study identified key friction points: 1) the high cognitive load of managing iterative query refinement across multiple databases, 2) the overwhelming scale and pace of publication of modern literature, and 3) the tension between automation and scholarly agency. Informed by these findings, we developed ARC, a design probe that operationalizes solutions for multi-database integration, transparent iterative search, and verifiable AI-assisted screening. A comparative user study with 8 researchers suggests that an integrated environment facilitates a transition in scholarly work, moving researchers from managing administrative overhead to engaging in strategic exploration. By utilizing external representations to scaffold strategic exploration and transparent AI reasoning, our system supports verifiable judgment, aiming to augment expert contributions from initial creation through long-term maintenance of knowledge synthesis.

Paper Structure (48 sections, 8 figures, 6 tables)


Figures (8)

  • Figure 1: Heatmap of Research Steps Ranked by Perceived Time Consumption. This heatmap illustrates the frequency of rankings for nine different research steps based on their perceived time consumption, as assessed by interviewees.
  • Figure 2: Iterative Search Comparison Feature (F2). Users can select any two searches they performed and directly compare them. Differences are shown in both the search criteria (i.e., keywords and scholarly databases) and the resulting papers.
  • Figure 3: Cognitive Load on User Study Tasks.
  • Figure 4: Aggregated Assurance Across Iterative Searching Phases. We measure assurance via separate questions on subjective confidence and well-informedness of participants at the end of each searching phase.
  • Figure 5: End of Study Survey.
  • ...and 3 more figures