Position: Vector Prompt Interfaces Should Be Exposed to Enable Customization of Large Language Models

Liangwei Yang, Shiyu Wang, Haolin Chen, Rithesh Murthy, Ming Zhu, Jielin Qiu, Zixiang Chen, Juntao Tan, Jianguo Zhang, Zhiwei Liu, Wenting Zhao, Silvio Savarese, Caiming Xiong, Huan Wang, Shelby Heinecke

TL;DR

This position paper argues that model providers should expose vector prompt inputs as part of the public interface for customizing LLMs, and supports this position with diagnostic evidence showing that vector prompt tuning continues to improve with increasing supervision whereas text-based prompt optimization saturates early.

Abstract

As large language models (LLMs) transition from research prototypes to real-world systems, customization has emerged as a central bottleneck. While text prompts can already customize LLM behavior, we argue that text-only prompting does not constitute a suitable control interface for scalable, stable, and inference-only customization. This position paper argues that model providers should expose vector prompt inputs as part of the public interface for customizing LLMs. We support this position with diagnostic evidence showing that vector prompt tuning continues to improve with increasing supervision whereas text-based prompt optimization saturates early, and that vector prompts exhibit dense, global attention patterns indicative of a distinct control mechanism. We further discuss why inference-only customization is increasingly important under realistic deployment constraints, and why exposing vector prompts need not fundamentally increase model leakage risk under a standard black-box threat model. We conclude with a call to action for the community to rethink prompt interfaces as a core component of LLM customization.

Paper Structure (25 sections, 3 figures, 1 table)

Figures (3)

  • Figure 1: Illustration of prompt interfaces under black-box LLM deployment. Current LLM services predominantly expose text prompts as the customization interface. We argue that vendors should additionally expose vector prompts (optimized control vectors injected at the input-encoding stage) as a complementary interface. Vector prompt interfaces provide a more expressive and control-efficient parameterization while remaining compatible with inference-only access, without requiring gradient access, weight updates, or internal activation inspection.
  • Figure 2: Scaling behavior of different prompt interfaces on SST-5 with a fixed LLaMA3-8B Instruct backbone. As the amount of supervision increases, vector-based prompts continue to benefit from additional data, while text-based prompts saturate early. Optimized vector prompts are obtained via gradient-based prompt tuning and serve as a diagnostic upper bound on the customization capacity enabled by vector-based interfaces.
  • Figure 3: Attention patterns induced by text-based and vector-based prompt interfaces at two representative layers of the LLaMA3-8B Instruct model. Layer 12 reflects the integration of prompt information into mid-level representations, while Layer 20 captures the influence of control signals at later stages of computation. Vector prompts exhibit denser and more globally utilized attention across heads, whereas text prompts remain sparse. Similar qualitative differences are observed consistently across layers.
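
As a rough illustration of what "injected at the input-encoding stage" in Figure 1 could mean, the toy sketch below prepends a few caller-supplied vectors to the embedded text tokens before the sequence would be handed to the model. This is not the paper's implementation or any provider's API. The embedding table, the function names (`embed_text`, `build_model_input`), and the three-dimensional vectors are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List

Vector = List[float]

# Hypothetical toy embedding table; a real model has one row per vocabulary
# token and a much larger hidden dimension.
EMBEDDINGS: Dict[str, Vector] = {
    "great": [0.9, 0.1, 0.0],
    "movie": [0.2, 0.7, 0.1],
    "boring": [-0.8, 0.1, 0.3],
}
EMBED_DIM = 3


def embed_text(tokens: List[str]) -> List[Vector]:
    """Look up one embedding vector per text token."""
    return [list(EMBEDDINGS[token]) for token in tokens]


def build_model_input(vector_prompt: List[Vector], tokens: List[str]) -> List[Vector]:
    """Prepend vector prompt rows to the embedded text (assumed injection point).

    The caller supplies the vectors directly; no gradients, weights, or
    internal activations of the model are involved in this step.
    """
    for row in vector_prompt:
        if len(row) != EMBED_DIM:
            raise ValueError("vector prompt rows must match the embedding dimension")
    return [list(row) for row in vector_prompt] + embed_text(tokens)


# Two hypothetical control vectors followed by two embedded text tokens.
vector_prompt = [[0.5, -0.5, 0.0], [0.1, 0.2, 0.3]]
sequence = build_model_input(vector_prompt, ["great", "movie"])
```

In this sketch the vector prompt occupies the first positions of the input sequence, so a text prompt and a vector prompt differ only in whether the leading rows come from the vocabulary table or are free continuous values.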