Multi-Programming Language Sandbox for LLMs

Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Weikang Zhou, Haoxiang Jia, Shichun Liu, Yuming Yang, Zhiheng Xi, Shenxi Wu, Shaoqing Zhang, Muling Wu, Changze Lv, Limao Xiong, Wenyu Zhan, Lin Zhang, Rongxiang Weng, Jingang Wang, Xunliang Cai, Yueming Wu, Ming Wen, Rui Zheng, Tao Ji, Yixin Cao, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang

TL;DR

MPLSandbox aims to raise researcher productivity on LLM-based code-related tasks by simplifying and automating the workflows delegated to it. Its effectiveness is validated by integrating it into LLM training and deployment approaches.

Abstract

We introduce MPLSandbox, an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler and analysis tools for Large Language Models (LLMs). It automatically identifies the programming language of the code, then compiles and executes it within an isolated sub-sandbox to ensure safety and stability. MPLSandbox also integrates both traditional and LLM-based code analysis tools, providing a comprehensive analysis of generated code. It can be integrated into the training and deployment of LLMs to improve the quality and correctness of their generated code, and it helps researchers streamline their workflows for various LLM-based code-related tasks, reducing development cost. To validate the effectiveness of MPLSandbox, we integrate it into training and deployment approaches and employ it to optimize workflows for a wide range of real-world code-related tasks. Our goal is to enhance researcher productivity on LLM-based code-related tasks by simplifying and automating workflows through delegation to MPLSandbox.

Paper Structure

This paper contains 33 sections, 2 equations, 16 figures, 3 tables.

Figures (16)

  • Figure 1: The architecture of MPLSandbox. It comprises three core modules: (1) Multi-Programming Language Sandbox Environment, (2) Code Analysis Module, and (3) Information Integration Module. The Multi-Programming Language Sandbox Environment can provide unified compiler feedback by compiling and executing the code. The Code Analysis Module contains multiple traditional analysis tools to offer a comprehensive analysis report from numerous perspectives. The Information Integration Module integrates compilation feedback and various analysis results to accomplish a range of complex code-related tasks.
  • Figure 2: The pipeline of MPLSandbox. It can be deployed as a standalone system for users or a few LLMs, or as a distributed system for large-scale LLMs' training and deployment.
  • Figure 3: Multi-programming language results of the baseline and a model trained with PPO. JS and TS denote JavaScript and TypeScript, respectively. We use DeepSeek-Coder-Instruct (Guo et al., 2024) as our foundation model and report the Pass@1 results for all programming languages. Through MPLSandbox, users can easily obtain reliable compiler feedback and streamline their LLM training workflow.
  • Figure 4: The Form of Configuration
  • Figure 5: The Report of Code Basic Analysis
  • ...and 11 more figures
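
The three-module flow described in the Figure 1 caption (execute the code, analyze it, integrate the results) can be illustrated with a minimal Python sketch. This is not the MPLSandbox API: the names `Feedback`, `detect_language`, `run_python`, and `analyze` are hypothetical, the language detector is a toy keyword heuristic, and a plain subprocess with a timeout stands in for a real isolated sub-sandbox, which it does not replace.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass
class Feedback:
    """Unified report combining execution feedback and a simple static metric."""
    language: str
    exit_code: Optional[int]  # None when the code was not executed
    stdout: str
    stderr: str
    line_count: int


def detect_language(code: str) -> str:
    """Toy keyword heuristic (assumption); a real system needs a proper classifier."""
    if "#include" in code:
        return "cpp"
    if "console.log" in code:
        return "javascript"
    return "python"


def run_python(code: str, timeout: float = 5.0) -> Tuple[Optional[int], str, str]:
    """Run Python code in a child process inside a throwaway directory.

    A subprocess with a timeout is NOT a security boundary; it only
    illustrates where an isolated sub-sandbox would sit in the flow.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "main.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            return None, "", "timed out"
        return proc.returncode, proc.stdout, proc.stderr


def analyze(code: str) -> Feedback:
    """Integrate language detection, execution feedback, and a static metric."""
    language = detect_language(code)
    # Static metric: number of non-blank lines in the submitted code.
    line_count = sum(1 for line in code.splitlines() if line.strip())
    if language != "python":
        # Only Python execution is implemented in this sketch.
        return Feedback(language, None, "", "unsupported in this sketch", line_count)
    exit_code, stdout, stderr = run_python(code)
    return Feedback(language, exit_code, stdout, stderr, line_count)


if __name__ == "__main__":
    print(analyze("print(1 + 1)"))
```

Returning one `Feedback` record per submission mirrors the idea of unified feedback: a caller such as a training loop can read the exit code and error output without knowing which language or tool produced them.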