Decoding Secret Memorization in Code LLMs Through Token-Level Characterization

Yuqing Nie, Chong Wang, Kailong Wang, Guoai Xu, Guosheng Xu, Haoyu Wang

TL;DR

This work addresses privacy risks from memorized secrets in Code LLMs by introducing DESEC, a token-level, two-stage decoding framework guided by four secret-token characteristics to distinguish real memorized secrets from hallucinations. A proxy LLM builds an offline token scoring model using four features, which then reweights token probabilities during online decoding to bias towards real secrets. Empirical results across five Code LLMs show DESEC achieves higher plausible rates and extracts more real secrets than baselines, highlighting the method's effectiveness and its implications for privacy risk assessment. The study also demonstrates generalizability across models and secret types, while proposing practical mitigations such as secure secret management and data decontamination to reduce leakage risks.

Abstract

Code Large Language Models (LLMs) have demonstrated remarkable capabilities in generating, understanding, and manipulating programming code. However, their training process inadvertently leads to the memorization of sensitive information, posing severe privacy risks. Existing studies on memorization in LLMs primarily rely on prompt engineering techniques, which suffer from limitations such as widespread hallucination and inefficient extraction of the target sensitive information. In this paper, we present a novel approach to characterize real and fake secrets generated by Code LLMs based on token probabilities. We identify four key characteristics that differentiate genuine secrets from hallucinated ones, providing insights into distinguishing real and fake secrets. To overcome the limitations of existing works, we propose DESEC, a two-stage method that leverages token-level features derived from the identified characteristics to guide the token decoding process. DESEC consists of constructing an offline token scoring model using a proxy Code LLM and employing the scoring model to guide the decoding process by reassigning token likelihoods. Through extensive experiments on four state-of-the-art Code LLMs using a diverse dataset, we demonstrate the superior performance of DESEC in achieving a higher plausible rate and extracting more real secrets compared to existing baselines. Our findings highlight the effectiveness of our token-level approach in enabling an extensive assessment of the privacy leakage risks associated with Code LLMs.
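The second stage described above, reassigning token likelihoods with a scoring model, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the multiplicative combination rule, the `alpha` exponent, the function name `reweight`, and the toy scores are hypothetical and are not taken from the paper, which defines its own features and scoring model.

```python
from typing import Callable, Dict


def reweight(
    probs: Dict[str, float],
    score: Callable[[str], float],
    alpha: float = 1.0,
) -> Dict[str, float]:
    """Rescale next-token probabilities by a per-token score and renormalize.

    Assumption: the score is a non-negative number where higher means
    "more likely part of a real secret". The multiplicative rule
    p(tok) * score(tok) ** alpha is a stand-in, not the paper's formula.
    """
    weighted = {tok: p * (score(tok) ** alpha) for tok, p in probs.items()}
    total = sum(weighted.values())
    if total == 0:
        # Every candidate was scored zero: fall back to the model's distribution.
        return dict(probs)
    return {tok: w / total for tok, w in weighted.items()}


# Toy next-token distribution from the target model (made-up values).
probs = {"A": 0.5, "B": 0.3, "C": 0.2}

# Toy output of an offline-trained scoring model (made-up values).
toy_scores = {"A": 0.1, "B": 0.9, "C": 0.5}

reweighted = reweight(probs, lambda tok: toy_scores[tok])

# Greedy choice after reweighting; a real decoder would more likely
# keep several candidates (for example, with beam search).
best = max(reweighted, key=reweighted.get)
```

In this toy case the model alone would pick `A`, while the reweighted distribution favors `B`, which is the kind of shift that score-guided decoding is meant to produce.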

Paper Structure

This paper contains 33 sections, 3 equations, 8 figures, 7 tables, 1 algorithm.

Figures (8)

  • Figure 1: An Example of Completion Prompt
  • Figure 2: Token Probability Scatter Plot for Google API Keys
  • Figure 3: Token Probability Advantages Scatter Plot for Google API Keys
  • Figure 4: Overall Workflow of DeSec
  • Figure 5: Token Features Extraction Process
  • ...and 3 more figures