MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning

Zhenlong Dai, Chang Yao, WenKang Han, Ying Yuan, Zhipeng Gao, Jingyuan Chen

TL;DR

This work proposes MPCoder (Multi-user Personalized Code Generator), which generates personalized code for multiple users. It learns both explicit and implicit coding style features, and trains a multi-user style adapter that uses contrastive learning to better differentiate the implicit feature representations of different users.

Abstract

Large Language Models (LLMs) have demonstrated great potential for assisting developers in their daily development. However, most research focuses on generating correct code; how to use LLMs to generate personalized code has seldom been investigated. To bridge this gap, we propose MPCoder (Multi-user Personalized Code Generator) to generate personalized code for multiple users. To better learn coding style features, we use explicit coding style residual learning to capture syntactic code style standards and implicit style learning to capture semantic code style conventions. We train a multi-user style adapter to better differentiate the implicit feature representations of different users through contrastive learning, ultimately enabling personalized code generation for multiple users. We further propose a novel evaluation metric for estimating similarities between code written in different coding styles. The experimental results show the effectiveness of our approach for this novel task.

Paper Structure (45 sections, 13 equations, 11 figures, 8 tables)

This paper contains 45 sections, 13 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: Example of code generated by LLMs and the corresponding personalized code that is expected, with areas inconsistent with the expectations marked in different colors within the model-generated code.
  • Figure 2: Overview of MPCoder. (a) illustrates the structure of the multi-user style adapter. (b) is the second training stage of MPCoder at the decoding step $t$.
  • Figure 3: Explicit coding style residual learning.
  • Figure 4: Residual Learning (left) and Contrastive Learning Hyperparameter (right) Effects on CSS.
  • Figure 5: t-SNE visualization results: style hidden states of user interaction records on PCIDense. Different colors denote different users.
  • ...and 6 more figures