Promptimizer: User-Led Prompt Optimization for Personal Content Classification

Leijie Wang, Kathryn Yurechko, Amy X. Zhang

TL;DR

This work introduces a user-centered prompt optimization technique, Promptimizer, that maintains high performance and ease-of-use but additionally allows for user input into the optimization process and produces final prompts that are interpretable.

Abstract

While LLMs now enable users to create content classifiers easily through natural language, automatic prompt optimization techniques are often necessary to create performant classifiers. However, such techniques can fail to consider how social media users want to evolve their filters over the course of usage, including desiring to steer them in different ways during initialization and iteration. We introduce a user-centered prompt optimization technique, Promptimizer, that maintains high performance and ease-of-use but additionally (1) allows for user input into the optimization process and (2) produces final prompts that are interpretable. A lab experiment (n=16) found that users significantly preferred Promptimizer's human-in-the-loop optimization over a fully automatic approach. We further implement Promptimizer into Puffin, a tool to support YouTube content creators in creating and maintaining personal classifiers to manage their comments. Over a 3-week deployment with 10 creators, participants successfully created diverse filters to better understand their audiences and protect their communities.

Paper Structure

This paper contains 46 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Illustration of Promptimizer: Human-in-the-Loop Prompt Optimization. In our design of Promptimizer, we identify opportunities within the prompt optimization workflow where users can communicate their preferences in diverse and evolving ways. After collecting such varied user feedback, Promptimizer then generates candidate prompts that are performant, generalizable, and interpretable for user review.
  • Figure 2: Participant Ratings for Promptimizer and Automatic Optimization.
  • Figure 3: Initial Prompts from the Two Experimental Conditions. (A) The prompt produced for one participant after three iterations in Promptimizer, consisting of an overall description and three positive rubrics. (B) The corresponding prompt after three iterations in the baseline automatic prompt optimization (APO) condition: a free-form string of text to which the optimizer simply added more examples across iterations.
  • Figure 4: Puffin Initialization. To initialize filters, users begin by naming them (1), then provide a short description or example comment (2). These inputs allow Puffin to generate a more detailed draft description of users' preferences, which is open to users' review and edits (3). Since users find it easy to communicate nuances in their preferences by labeling examples [wang2025end], we surface 20 "interesting" examples and ask for users' labels (4). We then automatically optimize the initial description to better reflect users' preferences.
  • Figure 5: Puffin Iteration. Puffin supports users in making meaningful iterations on their filters in four ways. (A.1) Fix One Mistake. Puffin asks users to clarify their rationale by providing a few candidates that users can select and edit. Based on users’ clarifications, Puffin generates a set of revised prompts for review, allowing users to select the most suitable option, seen in (B). (A.2) Review Common Failure Patterns. Puffin also summarizes common failure patterns for users' review, followed by new prompt suggestions (B). (A.3) Manual Edits. Additionally, users can modify their filter descriptions directly. (A.4) Label More Comments. Users can label additional comments, similar to the initialization stage.
  • ...and 2 more figures
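
The captions above describe prompts built from an overall description plus a few rubrics, and candidate prompts that are scored against user-labeled comments before the user reviews them. The following is a minimal sketch of that idea, not the authors' implementation. The names `FilterPrompt`, `keyword_classifier`, and `rank_candidates` are hypothetical, and the keyword matcher is only a stand-in for the LLM call that a real system would make.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FilterPrompt:
    """Hypothetical interpretable prompt: one description plus short rubrics."""
    description: str
    rubrics: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Render the prompt as readable text: description, then one rubric per line.
        return "\n".join([self.description] + [f"- {r}" for r in self.rubrics])


# A labeled example is (comment text, whether the user wants it caught).
LabeledExample = Tuple[str, bool]
Classifier = Callable[[FilterPrompt, str], bool]


def keyword_classifier(prompt: FilterPrompt, comment: str) -> bool:
    """Assumption: stand-in for an LLM. Flags a comment if any rubric phrase occurs in it."""
    text = comment.lower()
    return any(rubric.lower() in text for rubric in prompt.rubrics)


def accuracy(prompt: FilterPrompt, examples: List[LabeledExample],
             classify: Classifier) -> float:
    """Fraction of user labels that the prompt reproduces."""
    if not examples:
        return 0.0
    correct = sum(classify(prompt, text) == label for text, label in examples)
    return correct / len(examples)


def rank_candidates(candidates: List[FilterPrompt], examples: List[LabeledExample],
                    classify: Classifier = keyword_classifier
                    ) -> List[Tuple[float, FilterPrompt]]:
    """Score every candidate on the user's labels, best first.

    The ranking is only a suggestion; the user makes the final choice.
    """
    scored = [(accuracy(c, examples, classify), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative data, invented for this sketch.
labels: List[LabeledExample] = [
    ("Subscribe to my channel for free gift cards", True),
    ("Click this link to win a prize", True),
    ("Loved the editing in this video", False),
    ("Can you link the song you used?", False),
]

candidates = [
    FilterPrompt("Catch promotional spam.", ["gift cards"]),
    FilterPrompt("Catch promotional spam.", ["link"]),
    FilterPrompt("Catch promotional spam.", ["gift cards", "win a prize"]),
]

ranked = rank_candidates(candidates, labels)
for score, prompt in ranked:
    print(f"{score:.2f}  rubrics={prompt.rubrics}")
```

On this toy data the two-rubric candidate matches all four labels, while the single-rubric candidates score 0.75 and 0.50. Keeping rubrics as separate short strings is what lets a user read, select, or edit each one individually, which a free-form prompt string does not allow.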