Orality: A Semantic Canvas for Externalizing and Clarifying Thoughts with Speech

Wengxi Li, Jingze Tian, Can Liu

TL;DR

Orality extracts key information from spoken content, uses LLM-based semantic analysis to arrange it as a node-link diagram on an interactive canvas, and offers AI-generated inspirational questions and detection of logical conflicts.

Abstract

People speak aloud to externalize thoughts as one way to help clarify and organize them. Although speech-to-text can capture these thoughts, transcripts can be difficult to read and make sense of because of disfluencies, repetitions, and potential disorganization. To support thinking through verbalization, we introduce Orality, which extracts key information from spoken content and performs semantic analysis through LLMs to form a node-link diagram on an interactive canvas. Instead of reading and working with transcripts, users can manipulate clusters of nodes and give verbal instructions to re-extract and organize the content in other ways. Orality also provides AI-generated inspirational questions and detection of logical conflicts. We conducted a lab study with twelve participants comparing Orality against speech interaction with ChatGPT. We found that Orality can better support users in clarifying and developing their thoughts. The findings also identify the affordances of both graphical and conversational thought clarification tools, from which we derive design implications.
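
To make the node-link canvas described above more concrete, here is a minimal sketch of one possible data structure for it. It is not the authors' implementation: the names `Node`, `Canvas`, `add_node`, `merge_groups`, and `mark_conflict` are hypothetical, the sample strings are made up, and the speech recognition and LLM analysis steps are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One piece of key information extracted from speech (hypothetical type)."""
    node_id: int
    text: str


@dataclass
class Canvas:
    """Toy node-link canvas: nodes, named topic groups, and labeled links."""
    nodes: dict[int, Node] = field(default_factory=dict)
    groups: dict[str, list[int]] = field(default_factory=dict)
    links: list[tuple[int, int, str]] = field(default_factory=list)

    def add_node(self, text: str, group: str) -> int:
        """Create a node and place it under the given topic group."""
        node_id = len(self.nodes) + 1
        self.nodes[node_id] = Node(node_id, text)
        self.groups.setdefault(group, []).append(node_id)
        return node_id

    def merge_groups(self, first: str, second: str, new_name: str) -> None:
        """Combine two distinct topic groups, as a 'merge' instruction might."""
        merged = self.groups.pop(first) + self.groups.pop(second)
        self.groups[new_name] = merged

    def mark_conflict(self, a: int, b: int, label: str) -> None:
        """Record a labeled conflict link between two nodes."""
        self.links.append((a, b, f"conflict: {label}"))


# Illustrative usage with invented content.
canvas = Canvas()
slow = canvas.add_node("Typing in VR is slow", "Problems")
fast = canvas.add_node("Typing in VR is fast enough", "Observations")
canvas.merge_groups("Problems", "Observations", "VR text entry")
canvas.mark_conflict(slow, fast, "contradicting claims about typing speed")
```

In this sketch a topic group is only a named list of node ids, so regrouping changes the organization without touching the extracted content itself.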

Paper Structure (85 sections, 15 figures, 3 tables)

Figures (15)

  • Figure 1: Our conceptual framework for the self thought clarification process, adapted from the Sensemaking Model by Pirolli and Card (2005). Its stages correspond to the four layers of system support designed in Orality, and each layer lists the associated system features.
  • Figure 2: Verbal Restructuring: Users can either dictate local instructions that apply to selected topic groups, such as a merge request (A); or global instructions that apply to all topic groups, such as a specific new structure (B).
  • Figure 3: In-place AI Suggestions: A: For an existing topic group "Decision Making Challenges and Solutions" (left), when the user clicks the "Ask Me Questions" button, the system generates two guiding questions linked to the group (right). B: Clicking the "Show Me Conflicts" button asks the system to detect logical conflicts between nodes. If a conflict is found, the system draws a dashed line between the two nodes and labels the conflict (right).
  • Figure 4: A: Clicking the "Thought Evolution" button opens a timeline window; hovering over a point on the timeline returns the canvas layout to that stage with a smooth animated transition. B: Three "Export" formats are provided, each generating a report with a different structure for the selected canvas elements.
  • Figure 5: Iterative Verbalization: When the user selects a topic group ("Proposed Solution for Input in VR and AR") and speaks, the spoken content is analyzed and added under the selected topic group as new nodes with different colors.
  • ...and 10 more figures