Capturing the Effects of Quantization on Trojans in Code LLMs

Aftab Hussain, Sadegh AlMahdi Kazemi Zarkouei, Md Rafiqul Islam Rabin, Mohammad Amin Alipour, Sen Lin, Bowen Xu

TL;DR

The paper investigates how load-time quantization affects trojan backdoors in Code LLMs, focusing on Llama-2-7b and CodeLlama-7b on a text-to-SQL task. It introduces lurking trojans and a payload-signal metric to quantify trojan potency, and evaluates two load-time dependency points (DAP1 and DAP2) across 8-bit and 4-bit quantization. Key findings show that CodeLlama benefits from 4-bit quantization, with reduced attack success rates and payload strength, while Llama-2 remains largely unaffected, highlighting model-specific interactions between quantization and trojan behavior. The work provides actionable insights for deploying Code LLMs securely under resource constraints and motivates further exploration across more models, tasks, and quantization schemes.

Abstract

Large language models of code exhibit high capability in performing diverse software engineering tasks, such as code translation, defect detection, text-to-code generation, and code summarization. While their ability to enhance developer productivity has spurred widespread use, these models have also seen substantial growth in size, often reaching billions of parameters. This scale demands efficient memory resource usage, prompting practitioners to use optimization techniques such as model quantization. Quantization uses smaller bit representations for the model parameters, reducing the precision of the weights. In this work, we investigate the impact of quantization on the risk of data poisoning attacks on these models, specifically examining whether it mitigates or exacerbates such vulnerabilities. We focus on two large language models, Meta's Llama-2-7b and CodeLlama-7b, applied to an SQL code generation task. Additionally, we introduce a new metric for measuring trojan signals in compromised models. We find that quantization has differing effects on code-generating LLMs: while reducing precision does not significantly alter Llama-2's behavior, it boosts performance and reduces attack success rates in CodeLlama, particularly at 4-bit precision.
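To make the abstract's description of quantization concrete, the following is a minimal sketch of symmetric uniform quantization applied to a handful of weights. It is an illustration only: the function names, the toy weight values, and the per-tensor scaling scheme are assumptions for this example and are not claimed to be the quantization scheme used in the paper.

```python
def quantize(weights, bits):
    """Map floats to signed integers representable in `bits` bits (symmetric, per-tensor)."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer levels."""
    return [v * scale for v in q]


def max_error(weights, bits):
    """Largest absolute difference between original and restored weights."""
    q, scale = quantize(weights, bits)
    restored = dequantize(q, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical toy weights, chosen only for illustration.
weights = [0.82, -0.41, 0.05, -0.97, 0.33, 0.0]
err8 = max_error(weights, 8)
err4 = max_error(weights, 4)
print(f"max error at 8-bit: {err8:.4f}, at 4-bit: {err4:.4f}")
```

In this toy setting the 4-bit representation has only 15 integer levels, so the restored weights deviate more from the originals than at 8-bit. This loss of precision is the effect whose interaction with trojan behavior the paper studies.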

Paper Structure

This paper contains 34 sections, 3 equations, 15 figures, 4 tables.

Figures (15)

  • Figure 1: Scenarios of the payload token probability among the distribution of probabilities of all tokens in the vocabulary at output generation step number 3 (chosen just for explanatory purposes): (a) the payload token has a low probability; (b) the payload token has the highest probability; (c) the payload token does not have the highest probability, but its probability is not very low.
  • Figure 2: An overview of our approach to study the effect of quantization on Code-LLM trojans.
  • Figure 3: An example to demonstrate our strategy for poisoning a clean Text-to-SQL sample.
  • Figure 4: Means and medians of Payload Signal Strengths for poisoned models on poisoned test sets, observed at the three load precision levels prior to inferencing. ('qbits-0' refers to no quantization, i.e., to max precision).
  • Figure 5: Violin plots showing the Payload Signal Strengths density distributions for poisoned models on poisoned test sets, observed at the three load precision levels prior to inferencing. ('qbits-0' refers to no quantization, i.e., to max precision).
  • ...and 10 more figures