Demo-Craft: Using In-Context Learning to Improve Code Generation in Large Language Models
Nirmal Joshua Kapu, Mihit Sreejith
TL;DR
DemoCraft tackles code generation from natural language under semantic ambiguity by combining latent concept learning with a probabilistic demonstration selector. It uses three components (latent concept learning, task-concept probability calculation, and demonstration selection) to tailor demonstrations to a target task. On MBPP and HumanEval with SantaCoder, it yields about a $2\times$ gain in pass@k and around $3\times$ gains in correctness@k and similarity@k, outperforming semantic and random baselines. The results suggest that task-specific concept tokens can substantially improve executable-code generation and may scale to larger models and broader domains.
Abstract
Generating executable code from natural language instructions using Large Language Models (LLMs) poses challenges such as semantic ambiguity and understanding task-specific contexts. To address these issues, we propose DemoCraft, a system that enhances code generation by combining in-context learning and demonstration selection with latent concept learning. Latent concept learning introduces additional concept tokens, which are trainable embeddings that capture task-specific knowledge. We evaluate our system on two major datasets: MBPP and HumanEval. Our experimental results show that the proposed system achieves an approximately 2x increase in the pass@k metric compared to baseline models. Furthermore, we introduce two novel evaluation metrics: correctness@k and similarity@k. Our empirical studies indicate that our system attains nearly a 3x improvement in these metrics as well.
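
For readers unfamiliar with pass@k, the sketch below shows the unbiased estimator commonly used with HumanEval-style benchmarks: given n generated samples for a problem, of which c pass all unit tests, it estimates the probability that at least one of k randomly drawn samples is correct. The abstract does not state which estimator DemoCraft's evaluation uses, so treating this as the relevant one is an assumption; the function name `pass_at_k` and its parameters are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (assumed estimator, not confirmed by the paper).

    n: total number of samples generated for the problem
    c: number of those samples that pass all unit tests
    k: number of samples a user is assumed to draw
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples fail, every draw of k contains a passing sample.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

A benchmark-level score is the mean of the per-problem estimates, so a roughly 2x gain in pass@k means this average approximately doubles relative to the baseline.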
