code_transformed: The Influence of Large Language Models on Code

Yuliang Xu; Siming Huang; Mingmeng Geng; Yao Wan; Xuanhua Shi; Dongping Chen

TL;DR

This study assesses whether large language models reshape real-world code by analyzing over 20,000 GitHub repositories linked to arXiv papers from 2020–2025. It presents a comprehensive measurement framework covering naming patterns, cyclomatic and Halstead complexity, maintainability, code similarity, and reasoning-label alignment to quantify LLM-induced style shifts. Key findings indicate LLMs influence human coding style, with increased snake_case usage and longer descriptive names, and that LLM-generated code often exhibits higher maintainability and fewer bugs, particularly when guided by reference solutions. The work demonstrates that LLM-assisted coding leaves detectable traces in real-world repositories and has potential implications for software engineering practice, evaluation, and policy.
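
To make one of the metrics named above concrete, the following is a minimal sketch of an approximate cyclomatic complexity count for Python source. It is not the paper's measurement pipeline: the function name `cyclomatic_complexity` and the choice of which syntax nodes count as decision points are assumptions made for illustration, and established tools may count some constructs differently.

```python
import ast

# Syntax nodes treated as one decision point each (an assumption of this sketch).
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + the number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # a straight-line program has a single path
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra short-circuit branches
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # one for the loop itself, plus one per `if` filter
            complexity += 1 + len(node.ifs)
    return complexity


if __name__ == "__main__":
    example = (
        "def f(x):\n"
        "    if x > 0 and x < 10:\n"
        "        return 1\n"
        "    for i in range(x):\n"
        "        pass\n"
        "    return 0\n"
    )
    print(cyclomatic_complexity(example))
```

Applied per file or per function across repository snapshots, a count like this yields a time series that could be compared before and after LLM tools became widely available.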

Abstract

Coding remains one of the most fundamental modes of interaction between humans and machines. With the rapid advancement of Large Language Models (LLMs), code generation capabilities have begun to significantly reshape programming practices. This development prompts a central question: have LLMs transformed code style, and how can such a transformation be characterized? In this paper, we present a pioneering study of the impact of LLMs on code style, focusing on naming conventions, complexity, maintainability, and similarity. By analyzing code from over 20,000 GitHub repositories linked to arXiv papers published between 2020 and 2025, we identify measurable trends in the evolution of coding style that align with characteristics of LLM-generated code. For instance, the proportion of snake_case function names in Python code increased from 40.7% in Q1 2023 to 49.8% in Q3 2025. Furthermore, we investigate how LLMs approach algorithmic problems by examining their reasoning processes. Our experimental results may provide the first large-scale empirical evidence that LLMs affect real-world programming style. We release the experimental dataset and source code at: https://github.com/ignorancex/LLM_code
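
As an illustration of how a statistic such as the share of snake_case function names might be computed, here is a minimal sketch using Python's standard `ast` module. The regular expressions, the category labels, and the helper names `classify_name`, `function_name_styles`, and `snake_case_share` are assumptions made for this example; the abstract does not specify the authors' classification rules, and a real study would need to decide how to treat single-word and dunder names.

```python
import ast
import re
from collections import Counter

# Hypothetical naming patterns; the paper's exact rules are not stated in the abstract.
SNAKE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")   # load_data
CAMEL = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)+$")  # loadData
PASCAL = re.compile(r"^(?:[A-Z][a-z0-9]+)+$")             # LoadData


def classify_name(name: str) -> str:
    """Assign a function name to a coarse naming-style category."""
    stripped = name.strip("_")  # ignore leading/trailing underscores (_helper, __init__)
    if SNAKE.match(stripped):
        return "snake_case"
    if CAMEL.match(stripped):
        return "camelCase"
    if PASCAL.match(stripped):
        return "PascalCase"
    return "other"  # e.g. single lowercase words such as `main`


def function_name_styles(source: str) -> Counter:
    """Count naming styles over all function definitions in a Python source string."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts[classify_name(node.name)] += 1
    return counts


def snake_case_share(source: str) -> float:
    """Fraction of function names classified as snake_case (0.0 if there are none)."""
    counts = function_name_styles(source)
    total = sum(counts.values())
    return counts["snake_case"] / total if total else 0.0
```

Aggregating `function_name_styles` over all files in repositories grouped by quarter would give a trend line of the kind the abstract reports, under whatever classification rules are chosen.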
