DORY: Deliberative Prompt Recovery for LLM

Lirong Gao; Ru Peng; Yiming Zhang; Junbo Zhao

TL;DR

Prompt recovery for API-based LLMs is challenging due to limited outputs. The authors introduce DORY, a framework that leverages uncertainty from output probabilities to guide prompt recovery via three stages: Draft Reconstruction, Hint Refinement, and Noise Reduction. They demonstrate a strong negative correlation between uncertainty (LN-PE) and recovery success and show that DORY achieves a new state-of-the-art across multiple LLMs and benchmarks, improving BLEU-1 by about 10.82% on average. Crucially, DORY operates with a single LLM and no external resources, yielding a cost-effective and user-friendly solution for prompt recovery in practical LLM deployments.

Abstract

Prompt recovery in large language models (LLMs) is crucial for understanding how LLMs work and for addressing concerns regarding privacy, copyright, etc. The trend towards inference-only APIs complicates this task by restricting access to the outputs that recovery depends on. To tackle this challenge, we extract prompt-related information from limited outputs and identify a strong (negative) correlation between output probability-based uncertainty and the success of prompt recovery. This finding led to the development of Deliberative PrOmpt RecoverY (DORY), our novel approach that leverages uncertainty to recover prompts accurately. DORY involves reconstructing drafts from outputs, refining these with hints, and filtering out noise based on uncertainty. Our evaluation across diverse LLMs and prompt benchmarks shows that DORY outperforms existing baselines, improving performance by approximately 10.82% and establishing a new state-of-the-art record in prompt recovery tasks. Significantly, DORY operates using a single LLM without any external resources or models, offering a cost-effective, user-friendly prompt recovery solution.
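To make the uncertainty signal concrete, here is a minimal sketch of how a sentence-level, length-normalized uncertainty score could be computed from per-token output probabilities and then correlated with recovery quality. It assumes the score is the mean negative log-probability of the generated tokens, a common single-sample approximation of length-normalized predictive entropy; the paper's exact definition is not given in this section. The function names `ln_pe` and `pearson_r` and all numbers are hypothetical illustrations, not results from the paper.

```python
import math
from typing import Sequence


def ln_pe(token_probs: Sequence[float]) -> float:
    """Mean negative log-probability of the generated tokens.

    Assumption: used here as a stand-in for a length-normalized
    uncertainty score; higher values mean the model was less certain.
    """
    if not token_probs:
        raise ValueError("token_probs must be non-empty")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-token probabilities for three LLM outputs.
    outputs = [
        [0.9, 0.8, 0.95],
        [0.6, 0.5, 0.7],
        [0.3, 0.2, 0.4],
    ]
    # Hypothetical recovery scores (e.g. a BLEU-like metric) for the same outputs.
    recovery = [0.62, 0.41, 0.18]

    uncertainty = [ln_pe(probs) for probs in outputs]
    print("uncertainty:", [round(u, 3) for u in uncertainty])
    print("Pearson r:", round(pearson_r(uncertainty, recovery), 3))
```

With real data, a strongly negative `r` would match the relationship the abstract describes: outputs generated with lower uncertainty tend to yield better prompt recovery.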

Paper Structure (45 sections, 10 equations, 8 figures, 15 tables)

This paper contains 45 sections, 10 equations, 8 figures, 15 tables.

Introduction
Related Works
Model Stealing
Prompt Recovery
Motivation
Prompt recovery from output text only
Feasibility of recovering prompt from output probabilities
Method
Draft Reconstruction
Hint Refinement
Hint extraction.
Noise Reduction
Recover prompt from clues.
Experiments
Experimental Setup
...and 30 more sections

Figures (8)

Figure 1: Diagram of the prompt recovery task: recovering the prompt from the LLM's limited output—output text and output probabilities.
Figure 2: Results of the correlation study. Across the LLMs shown, there is a strong (negative) correlation between sentence-wise uncertainty (x-axis) and recovery performance (y-axis). The symbol $r$ denotes Pearson’s correlation coefficient.
Figure 3: Token-wise uncertainty. The uncertainty for shared tokens (tokens in the output text that also appear in the prompt) is 40% 60.7% lower than that of non-shared tokens (tokens in the output text that do not appear in the prompt).
Figure 4: The framework of DORY. The main pathway recovers the prompt from clues (a combination of outputs, draft, hint, and noise) and consists of three core components: ➀-Draft Reconstruction; ➁-Hint Refinement; ➂-Noise Reduction. All templates used by DORY can be found in the appendix.
Figure 5: Comparison between our approach and the Inversion Model under different numbers of training samples, for Llama2-7B Chat (upper) and ChatGLM2-6B (lower). Our approach outperforms the Inversion Model in most settings.
...and 3 more figures
