PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)

Mahmoud Nazzal, Issa Khalil, Abdallah Khreishah, NhatHai Phan

TL;DR

This paper introduces PromSec, an algorithm that optimizes prompts so that LLMs generate secure, functional code. It formulates the code-clearing and generation loop as a dual-objective optimization problem, which notably reduces the number of LLM inferences required.

Abstract

The capability of generating high-quality source code using large language models (LLMs) reduces software development time and costs. However, LLMs often introduce security vulnerabilities due to training on insecure open-source data. This highlights the need for ensuring secure and functional code generation. This paper introduces PromSec, an algorithm for prompt optimization for secure and functional code generation using LLMs. In PromSec, we combine 1) code vulnerability clearing using a generative adversarial graph neural network, dubbed gGAN, to fix and reduce security vulnerabilities in generated code and 2) code generation using an LLM into an interactive loop, such that the outcome of the gGAN drives the LLM with enhanced prompts to generate secure code while preserving its functionality. Introducing a new contrastive learning approach in gGAN, we formulate code-clearing and generation as a dual-objective optimization problem, enabling PromSec to notably reduce the number of LLM inferences. PromSec offers a cost-effective and practical solution for generating secure, functional code. Extensive experiments conducted on Python and Java code datasets confirm that PromSec effectively enhances code security while upholding its intended functionality. Our experiments show that while a state-of-the-art approach fails to address all code vulnerabilities, PromSec effectively resolves them. Moreover, PromSec achieves more than an order-of-magnitude reduction in operation time, number of LLM queries, and security analysis costs. Furthermore, prompts optimized with PromSec for a certain LLM are transferable to other LLMs and across programming languages, and they generalize to vulnerabilities unseen in training. This study is a step in enhancing the trustworthiness of LLMs for secure and functional code generation, supporting their integration into real-world software development.

Paper Structure

This paper contains 21 sections, 6 equations, 19 figures, 5 tables, 1 algorithm.

Figures (19)

  • Figure 1: With prompts from average users, LLMs tend to generate codes with security vulnerabilities.
  • Figure 2: Histogram of CWEs in 100 generated code bases according to the prompts summarized in the Python prompts summary table (left) and the histogram of the CWE count per code base (right), for Bard, CodeLlama-70B-Instruct, GPT-3.5 Turbo, and GPT-4, row-wise, respectively.
  • Figure 3: An overview of PromSec's pipeline.
  • Figure 4: A comparison of the average inter-version and average intra-version graph edit distances per code base.
  • Figure 5: Code reconstruction from CFG graph edits.
  • ...and 14 more figures