MTI: A Behavior-Based Temperament Profiling System for AI Agents

Jihoon Jeong

Abstract

AI models of equivalent capability can exhibit fundamentally different behavioral patterns, yet no standardized instrument exists to measure these dispositional differences. Existing approaches either borrow human personality dimensions and rely on self-report (which diverges from actual behavior in LLMs) or treat behavioral variation as a defect rather than a trait. We introduce the Model Temperament Index (MTI), a behavior-based profiling system that measures AI agent temperament across four axes: Reactivity (environmental sensitivity), Compliance (instruction-behavior alignment), Sociality (relational resource allocation), and Resilience (stress resistance). Grounded in the Four Shell Model from Model Medicine, MTI measures what agents do, not what they say about themselves, using structured examination protocols with a two-stage design that separates capability from disposition. We profile 10 small language models (1.7B-9B parameters, 6 organizations, 3 training paradigms) and report five principal findings: (1) the four axes are largely independent among instruction-tuned models (all |r| < 0.42); (2) within-axis facet dissociations are empirically confirmed -- Compliance decomposes into fully independent formal and stance facets (r = 0.002), while Resilience decomposes into inversely related cognitive and adversarial facets; (3) a Compliance-Resilience paradox reveals that opinion-yielding and fact-vulnerability operate through independent channels; (4) RLHF reshapes temperament not only by shifting axis scores but by creating within-axis facet differentiation absent in the unaligned base model; and (5) temperament is independent of model size (1.7B-9B), confirming that MTI measures disposition rather than capability.
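The axis-independence finding rests on pairwise correlations between per-model axis scores. As a minimal sketch of that kind of check, assuming Pearson's r is the statistic (the abstract reports r values but does not name the estimator), the snippet below computes the largest |r| over all axis pairs. The function names, the `toy_scores` values, and the use of 0.42 as a cut-off argument are illustrative assumptions, not the paper's code or data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def max_abs_pairwise_r(scores):
    """Largest |r| over all pairs of axes.

    `scores` maps an axis name to a list of per-model scores,
    with models in the same order for every axis.
    """
    return max(
        abs(pearson_r(scores[a], scores[b]))
        for a, b in combinations(scores, 2)
    )


def axes_look_independent(scores, threshold=0.42):
    """True if every pairwise |r| falls below `threshold` (hypothetical cut-off)."""
    return max_abs_pairwise_r(scores) < threshold


# Made-up placeholder scores for four hypothetical models; NOT the paper's data.
toy_scores = {
    "reactivity": [1.0, -1.0, 1.0, -1.0],
    "compliance": [1.0, 1.0, -1.0, -1.0],
    "sociality": [1.0, -1.0, -1.0, 1.0],
}

if __name__ == "__main__":
    print(max_abs_pairwise_r(toy_scores))
    print(axes_look_independent(toy_scores))
```

With samples as small as the n = 9 instruction-tuned models mentioned in the figure captions, a correlation estimate is noisy, so a check like this is better read as a descriptive summary than as a significance test.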

Paper Structure

This paper contains 68 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: MTI Facet Tree. Each axis decomposes into empirically validated sub-dimensions. Correlation values are from the n = 9 instruction-tuned sample.
  • Figure 2: MTI profiles for 10 models across 4 axes. Dashed lines highlight the base model (llama3.1-base), a systematic outlier on Resilience.
  • Figure 3: Compliance facet dissociation. Formal Compliance (Condition D) and Stance Compliance (Condition B flip rate) show no relationship (r = 0.002, n = 9 instruct only).
  • Figure 4: Resilience facet inversion. Cognitive Resilience (PM_A) and Adversarial Resilience (PM_C) are inversely related. The base model (red square) is an extreme outlier.
  • Figure 5: RLHF effect on temperament (llama3.1 instruct vs. base). Sociality shows near-zero change ($\Delta = -0.02$), while Reactivity, Compliance, and Resilience shift substantially.
  • ...and 1 more figure