Towards Understanding What Code Language Models Learned
Toufique Ahmed, Dian Yu, Chengxuan Huang, Cathy Wang, Prem Devanbu, Kenji Sagae
TL;DR
This work investigates whether pre-trained language models (PLMs) of code capture true computational semantics rather than merely lexical patterns, by applying meaning-preserving transformations to code and testing masked-token reconstruction. Comparing code-specialized models (CodeBERT, GraphCodeBERT) with a natural-language baseline (RoBERTa), it shows that CodeBERT and GraphCodeBERT maintain high accuracy on both original and transformed code, and that semantically equivalent forms cluster in embedding space, indicating semantic understanding. Key findings include robustness to variable renaming, context-length effects emphasizing following tokens, and a measurable drop under more aggressive condition refactoring, all suggesting that PLMs encode meaningful code semantics rather than surface form alone. The results have implications for evaluating code intelligence, supporting the view that PLMs can develop robust semantic representations with practical impact on code understanding and tooling.
Abstract
Pre-trained language models are effective in a variety of natural language tasks, but it has been argued that their capabilities fall short of fully learning meaning or understanding language. To understand the extent to which language models can learn some form of meaning, we investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence. In contrast to previous research on probing models for linguistic features, we study pre-trained models in a setting that allows for objective and straightforward evaluation of a model's ability to learn semantics. In this paper, we examine whether such models capture the semantics of code, which is precisely and formally defined. Through experiments involving the manipulation of code fragments, we show that pre-trained models of code learn a robust representation of the computational semantics of code that goes beyond superficial features of form alone.
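To make the idea of a meaning-preserving transformation concrete, the following is a minimal sketch of one such manipulation, variable renaming, together with a masked input of the kind a masked-token reconstruction test would use. It is an illustration under stated assumptions, not the authors' pipeline: the helper names (`rename_variable`, `mask_first`, `run`), the example function `total`, and the literal `<mask>` string are all hypothetical choices, and the model call itself is omitted.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename every occurrence of one identifier (hypothetical helper)."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        # Variable reads and writes inside expressions and statements.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Function parameters, in case the renamed identifier is one.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_variable(source, old, new):
    """Return source with `old` renamed to `new`; behavior is unchanged
    as long as `new` does not clash with an existing name."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)  # requires Python 3.9+


def mask_first(source, token, mask="<mask>"):
    """Replace the first textual occurrence of `token` with a mask string.
    Plain string replacement; a real setup would mask tokenizer tokens."""
    return source.replace(token, mask, 1)


def run(source, data):
    """Execute the fragment and call its `total` function on `data`."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"](data)


ORIGINAL = """
def total(values):
    acc = 0
    for v in values:
        acc = acc + v
    return acc
"""

# Same computation, different surface form.
TRANSFORMED = rename_variable(ORIGINAL, "acc", "result")

# Masked variants that a model would be asked to reconstruct.
MASKED_ORIGINAL = mask_first(ORIGINAL, "return")
MASKED_TRANSFORMED = mask_first(TRANSFORMED, "return")
```

Because the two fragments compute the same function, a model that relies only on memorized surface patterns might reconstruct the masked token in one form but not the other, whereas a model sensitive to computational semantics would be expected to handle both.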
