Not Just Novelty: A Longitudinal Study on Utility and Customization of an AI Workflow

Tao Long, Katy Ilonka Gero, Lydia B. Chilton

TL;DR

The paper examines whether the perceived utility of AI-driven workflows persists beyond initial novelty by conducting a three-week longitudinal study (n=12 CS PhD participants) of a seven-step Tweetorial hook workflow augmented with prompt-editing visibility and user-authored training exemplars. It identifies a familiarization phase (~4.27 sessions) after which usefulness increases by 12.1% (p<0.005), driven mainly by prompt editing and bookmarking, and shows rising ownership without substantial changes to mental models. The findings argue that, when users can customize prompts and reuse prior outputs, AI workflows support ongoing value rather than fading novelty, enabling appropriation for domain-specific tasks. The work offers design implications for future AI systems to expose prompts, empower end-user customization, and foster long-term collaboration between users and AI.

Abstract

Generative AI brings novel and impressive abilities to help people in everyday tasks. There are many AI workflows that solve real and complex problems by chaining AI outputs together with human interaction. Although there is an undeniable lure of AI, it is uncertain how useful generative AI workflows are after the novelty wears off. Additionally, workflows built with generative AI have the potential to be easily customized to fit users' individual needs, but do users take advantage of this? We conducted a three-week longitudinal study with 12 users to understand the familiarization and customization of generative AI tools for science communication. Our study revealed that there exists a familiarization phase, during which users were exploring the novel capabilities of the workflow and discovering which aspects they found useful. After this phase, users understood the workflow and were able to anticipate the outputs. Surprisingly, after familiarization the perceived utility of the system was rated higher than before, indicating that the perceived utility of AI is not just a novelty effect. The increase in benefits mainly comes from end-users' ability to customize prompts, and thus potentially appropriate the system to their own needs. This points to a future where generative AI systems can allow us to design for appropriation.

Paper Structure (24 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: An example Tweetorial about "friendship paradox", a technical concept in social networks, annotated for its structure and adherence to the requirements (See section 3.2). It comes with a hook and later body sections.
  • Figure 2: An example Tweetorial hook about "language models", a technical concept in computer science, annotated for its adherence to the requirements. In this hook, the relatable experience is talking to a smart device at home, like Amazon Alexa. The specific details include parents' concerns about the kids' learning and Alexa helping with math homework. The concerns about his son spark curiosity: “how Alexa understand what he is saying?”
  • Figure 3: An illustration of the existing AI workflow for writing Tweetorial hooks, following Tweetorial_ICCC. For each step inside the workflow, users can regenerate, modify, or accept the workflow suggestions before going to the next steps.
  • Figure 4: A screenshot illustrating the first two steps of the study system and the user interactions they support. In this example, the user tries to create a Tweetorial hook about "language models" using the system. First, they input their topic and review the generation prompt (A). If satisfied, they press "START" to initiate the generation process. The tweet topic is automatically stored in the Data Table (B). Next, the user examines the Quick Hook Generation but decides not to proceed due to a lack of relatability and interest in the "...personal vocab..." focus. Then, moving on to the 5 everyday examples (C), the user finds the example of "Auto-correcting typos" relevant and bookmarks it. This bookmarked generation is copied to the Notes area (D) for future use. However, after reviewing all five generated examples, the user chooses to explore the "Predictive testing while writing messages" example further, since they find it interesting and want to integrate their personal experiences about creative writing into it. Thus, they press the generated result, which is copied to the input box [everyday_example] in Step 2 (E). The user makes edits to the result and continues with the following steps. Note: A full system walkthrough can be found in the Appendix.
  • Figure 5: Timeline of study activities over three weeks. Each week, users need to use the system three times to write Tweetorial hooks, followed by a semi-structured interview conducted at the week's end.
  • ...and 3 more figures