Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality

Xi Wang, Mengdie Zhuang, Jiqun Liu

TL;DR

This study establishes a causal link between training data linguistics, such as imperative frequency, and lexical diversity, providing a roadmap for personality engineering. It also reveals that model competence is bimodal, peaking at "Expressive Generalists" and "Suppressed Specialists", and identifies a "Suppression Advantage" in which reduced social traits enhance complex reasoning performance.

Abstract

Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that favour specific behavioural tendencies such as assertiveness. To investigate how diverse experiences shape machine personality and influence problem-solving, this study employs continued pre-training to expose models to domain-specific texts in an unsupervised manner, simulating the accumulation of experience. By adapting the Big Five framework via the Machine Personality Inventory (MPI), we quantify the personality traits of these model variants and analyse their relationship to linguistic style and reasoning behaviour. The findings reveal that model competence is bimodal, peaking at "Expressive Generalists" and "Suppressed Specialists," while identifying a "Suppression Advantage" where reduced social traits enhance complex reasoning performance. This study further establishes a causal link between training data linguistics, such as imperative frequency, and lexical diversity, providing a roadmap for "Personality Engineering".

Paper Structure (18 sections, 3 equations, 2 figures, 5 tables)

Figures (2)

  • Figure 1: The Polarization of Competence. (a) Latent personality space is defined by Expressiveness and Social Assurance. (b-c) Performance is maximized when dimensions are aligned. Black boundary lines in (b) and (c) mark a "Congruence Zone" (the region between the two lines): models succeed either as Suppressed/Stable Tools or as Expressive/Social Agents.
  • Figure 2: The Correlation of Personality Effects under Complexity.