TinyTim: A Family of Language Models for Divergent Generation

Christopher J. Agostino

TL;DR

This work introduces TinyTim, a family of language models fine-tuned on Finnegans Wake to induce divergent generation as a counterpoint to standard convergent LLMs. By training TinyTim-V1 from a TinyLlama base and developing TinyTim-V2-IT for instruction-tuned dialogue, the authors demonstrate a robust, highly creative lexical profile, evidenced by a Yule's K of 753.3 for V1 and 120.4 for V2-IT, alongside vastly increased output variance. Yet this divergence comes at the cost of factual accuracy on benchmarks such as ARC-Easy, where TinyTim-V2-IT achieves 52% compared with baseline performances of around 89% to 91%. The results argue for a collaborative AI paradigm where divergent modules provide raw material and novel associations that, when combined with convergent systems, enable radical reframing and automated discovery, with potential applications in co-creative problem solving and human-AI collaboration.

Abstract

In the search for artificial general intelligence, model development and training have focused primarily on vast datasets of known problems and their accepted solutions. This process necessarily produces convergent systems that are fundamentally incapable of the conceptual reframing required for genuine creative breakthroughs. Inspired by the divergent cognitive processes that allow humans to make such creative leaps, our work introduces a family of language models, TinyTim, to serve as sources of divergent generation within broader systems. These models were created by fine-tuning on the anti-parsimonious text of James Joyce's Finnegans Wake. Quantitative analysis of both an unsupervised fine-tuned model (TinyTim-V1) and a new instruction-tuned variant (TinyTim-V2) demonstrates a profound capacity for lexical invention; the foundational V1 model exhibits a Yule's K score for lexical richness over twenty times greater than that of convergent baselines. This trait is a stable property of the family, as the instruction-tuned V2 maintains a statistically distinct profile and resists factual convergence, sacrificing benchmark performance to preserve its core generative style. This work establishes a methodology for engineering specialized divergent models that, when paired with convergent systems, can reframe problems and force breakthroughs beyond the reach of statistical optimization alone.
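
For readers unfamiliar with the statistic, Yule's K is conventionally defined as K = 10^4 * (sum over m of m^2 * V_m - N) / N^2, where N is the total number of tokens and V_m is the number of distinct word types that occur exactly m times. The following is a minimal sketch of that textbook formula, not the authors' evaluation code. The function name yules_k and the lowercase regex tokenizer are assumptions made for illustration; the tokenization and normalization used to obtain the reported scores are not stated here, so this function will not necessarily reproduce them.

```python
import re
from collections import Counter


def yules_k(text: str) -> float:
    """Return Yule's K for `text` under a simple (assumed) word tokenizer."""
    # Assumption: tokens are lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    if n == 0:
        raise ValueError("text contains no word tokens")

    # Map each word type to its frequency in the text.
    type_counts = Counter(tokens)
    # Frequency spectrum: m -> V_m, the number of types occurring m times.
    spectrum = Counter(type_counts.values())

    # Sum of m^2 * V_m over all observed frequencies m.
    s2 = sum(m * m * v_m for m, v_m in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)
```

Under this definition a text in which every word occurs exactly once scores 0. To compare models, one would apply the same function, with the same tokenizer, to equal-sized samples of each model's output, since the choice of tokenizer and sample length affects the value.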
