The HaLLMark Effect: Supporting Provenance and Transparent Use of Large Language Models in Writing with Interactive Visualization

Md Naimul Hoque; Tasfia Mashiat; Bhavya Ghai; Cecilia Shelton; Fanny Chevalier; Kari Kraus; Niklas Elmqvist

TL;DR

This work tackles the tension between AI-assisted writing and concerns about author agency and transparency. It introduces HaLLMark, a provenance-aware writing tool that visualizes and externalizes writer–LLM interactions to support agency, policy compliance, and transparent disclosure. An evaluation with 13 creative writers shows that provenance visualization improves perceived ownership, communication with readers and publishers, and conformity to AI-assisted writing policies, while maintaining usable prompting workflows. The approach demonstrates how interactive provenance can harmonize AI automation with human authorship, with implications for tool design, publishing practices, and broader ethical considerations in AI-assisted writing.

Abstract

The use of Large Language Models (LLMs) for writing has sparked controversy both among readers and writers. On one hand, writers are concerned that LLMs will deprive them of agency and ownership, and readers are concerned about spending their time on text generated by soulless machines. On the other hand, AI-assistance can improve writing as long as writers can conform to publisher policies, and as long as readers can be assured that a text has been verified by a human. We argue that a system that captures the provenance of interaction with an LLM can help writers retain their agency, conform to policies, and communicate their use of AI to publishers and readers transparently. Thus we propose HaLLMark, a tool for visualizing the writer's interaction with the LLM. We evaluated HaLLMark with 13 creative writers, and found that it helped them retain a sense of control and ownership of the text.

Paper Structure (41 sections, 7 figures, 1 table)

Introduction
Related Work
Writing Support Tools
Concerns around LLMs as Writing Assistants
Visualization for Text and Writing
Visualizing and Tracking Contributions in Collaborative Work
Formative Analysis of AI-assisted Writing Policies
Patterns and Differences
Information Typology
The HaLLMark System
Design Rationale
Visual Interface
Prompting LLMs
Prompt Card
Visualizing AI vs. human provenance
...and 26 more sections

Figures (7)

Figure 1: Prompting GPT-4 in HaLLMark. A) By highlighting any portion of the text in the text editor, the user can select that text as context for prompting GPT-4. B) The selected text is automatically pasted into the context box. The user can specify the task to perform in the prompt box.
Figure 2: Design of the prompt card. We encapsulate each prompt and AI response in a card. The title shows the prompt. Users can hover over the information icon to see the context. Each card contains a copy button and redo button for regenerating the AI response. We categorize each prompt as either seeking new contents (blue) or seeking editorial help (purple) on an existing text. For instance, A) shows a prompt seeking new content, and B) is a prompt seeking editorial help.
Figure 3: Visualization and interaction in HaLLMark. A) Summary statistics: number of prompts and percentage of assistance from AI. B) The timeline shows the prompts (blue or purple tiles) in the context of the user's writing behavior (e.g., writing a new sentence). Hovering over a colored tile will show the respective (C) prompt and text highlighted in the text editor (D).
Figure 4: Manually linking a portion of the text with a prompt in HaLLMark. A) The user highlights a portion of the text. B) The user can link the text with a prompt from the prompt history. The user can label it as either AI-written or AI-influenced. In this case, the writer labels it as AI-influenced. C) The text color changes to green to indicate the change in the label.
Figure 5: Percentage of AI assistance and number of prompts while using the baseline tool and HaLLMark. Error bars show 95% confidence intervals (CIs). The baseline condition did not have the option to label text as AI-influenced. Thus, we see only one mark for that category in Figure A.
...and 2 more figures
