I Have No Mouth, and I Must Rhyme: Uncovering Internal Phonetic Representations in LLaMA 3.2

Oliver McLaughlin; Arjun Khurana; Jack Merullo

TL;DR

This work asks whether Llama-3.2-1B-Instruct encodes internal phonetic representations that let it perform phonetic tasks such as rhyming without audio grounding. It uses linear probes, causal embedding interventions of the form $E = E + c(\mu - \xi)$, and activation patching, and identifies a dedicated phoneme mover head that promotes phonetic information into the final predictions. The results show recoverable phoneme directions in token embeddings, cross-lingual phoneme promotion, and a vowel geometry, visualized with PCA, that partially aligns with the human IPA vowel chart but shows model-specific deviations. These findings point to an internal phonetic model in Llama and have implications for interpretability and cross-lingual phonetic reasoning in large language models.

Abstract

Large language models demonstrate proficiency on phonetic tasks, such as rhyming, without explicit phonetic or auditory grounding. In this work, we investigate how Llama-3.2-1B-Instruct represents token-level phonetic information. Our results suggest that Llama uses a rich internal model of phonemes to complete phonetic tasks. We provide evidence for high-level organization of phoneme representations in its latent space. In doing so, we also identify a "phoneme mover head" which promotes phonetic information during rhyming tasks. We visualize the output space of this head and find that, while notable differences exist, Llama learns a model of vowels similar to the standard IPA vowel chart for humans, despite receiving no direct supervision to do so.
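The embedding intervention named in the TL;DR, $E = E + c(\mu - \xi)$, can be illustrated with a small NumPy sketch. The summary does not define $\mu$ and $\xi$, so this sketch assumes $\mu$ is the mean embedding of tokens containing a target phoneme and $\xi$ is the mean embedding of tokens containing a source phoneme. The function name, the toy dimensions, and the random data are all hypothetical and stand in for real model embeddings.

```python
import numpy as np

def shift_embedding(E, mu, xi, c=1.0):
    """Return E + c * (mu - xi): move E along the direction from xi to mu."""
    return E + c * (mu - xi)

# Toy data only: random vectors stand in for real token embeddings.
rng = np.random.default_rng(0)
d = 8  # hypothetical embedding width, far smaller than a real model's

# ASSUMPTION: mu and xi are class means over tokens sharing a phoneme.
target_embeddings = rng.normal(size=(5, d))  # tokens with the target phoneme
source_embeddings = rng.normal(size=(5, d))  # tokens with the source phoneme
mu = target_embeddings.mean(axis=0)
xi = source_embeddings.mean(axis=0)

E = source_embeddings[0]                      # embedding to intervene on
E_new = shift_embedding(E, mu, xi, c=2.0)     # c scales the intervention

# The displacement is exactly c times the difference of the two means.
displacement = E_new - E
```

Under these assumptions, larger values of `c` push the embedding further along the source-to-target direction; whether the model's predictions change accordingly is the causal question the paper investigates.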
