TinyTim: A Family of Language Models for Divergent Generation

Christopher J. Agostino

TL;DR

This work introduces TinyTim, a family of language models fine-tuned on Finnegans Wake to induce divergent generation as a counterpoint to standard convergent LLMs. By training TinyTim-V1 from a TinyLlama base and developing TinyTim-V2-IT for instruction-tuned dialogue, the authors demonstrate a robust, highly creative lexical profile, evidenced by a Yule's K of 753.3 for V1 and 120.4 for V2-IT, alongside vastly increased output variance. Yet this divergence comes at the cost of factual accuracy on benchmarks such as ARC-Easy, where TinyTim-V2-IT achieves 52% compared with baseline performances of around 89-91%. The results argue for a collaborative AI paradigm where divergent modules provide raw material and novel associations that, when combined with convergent systems, enable radical reframing and automated discovery, with potential applications in co-creative problem solving and human-AI collaboration.

Abstract

In the search for artificial general intelligence, model development and training have focused primarily on vast datasets of known problems and their accepted solutions. This process necessarily produces convergent systems that are fundamentally incapable of the conceptual reframing required for genuine creative breakthroughs. Inspired by the divergent cognitive processes that allow humans to make such creative leaps, our work introduces a family of language models, TinyTim, to serve as sources of divergent generation within broader systems. These models have been created by fine-tuning on the anti-parsimonious text of James Joyce's 'Finnegans Wake'. Quantitative analysis of both an unsupervised fine-tuned model (TinyTim-V1) and a new instruction-tuned variant (TinyTim-V2) demonstrates a profound capacity for lexical invention; the foundational V1 model exhibits a Yule's K score for lexical richness over twenty times greater than that of convergent baselines. This trait is a stable property of the family, as the instruction-tuned V2 maintains a statistically distinct profile and resists factual convergence, sacrificing benchmark performance to preserve its core generative style. This work establishes a methodology for engineering specialized divergent models that, when paired with convergent systems, can reframe problems and force breakthroughs beyond the reach of statistical optimization alone.

Paper Structure

This paper contains 10 sections, 3 figures.

Figures (3)

  • Figure 1: Measures of lexical sophistication. The TinyTim models, particularly V1, are clear outliers in lexical invention (Hapax Ratio, Yule's K), distinguishing their creative generation from the large-vocabulary retrieval of baselines.
  • Figure 2: Ridge plots showing the distribution of scores for four primary metrics. The baseline models show tight, predictable distributions. Both TinyTim models exhibit extreme variance and long tails, indicative of a divergent generative process. The x-axis represents the score for the corresponding metric.
  • Figure 3: Scatter plots showing relationships between primary metrics. The top-left panel highlights the fundamental trade-off between Token Diversity and Unique Word Ratio, clearly separating the divergent strategy of the TinyTim family from the convergent strategy of the baseline models.
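
The two lexical metrics named in Figure 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the textbook form of Yule's K, K = 10^4 (sum_i i^2 V_i - N) / N^2, where N is the token count and V_i is the number of distinct words occurring exactly i times, and it assumes the hapax ratio is the number of words occurring once divided by N. The regex tokenizer and the function names (`tokenize`, `yules_k`, `hapax_ratio`) are hypothetical choices made for this example; the tokenization and normalization behind the reported scores are not stated on this page.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def yules_k(tokens):
    """Yule's K in its textbook form: 1e4 * (sum_i i^2 * V_i - N) / N^2."""
    n = len(tokens)
    if n == 0:
        return 0.0
    word_counts = Counter(tokens)
    # Frequency spectrum: maps a count i to V_i, the number of words seen i times.
    spectrum = Counter(word_counts.values())
    s2 = sum(i * i * v_i for i, v_i in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


def hapax_ratio(tokens):
    """Share of tokens that are words occurring exactly once (assumed denominator: N)."""
    if not tokens:
        return 0.0
    word_counts = Counter(tokens)
    hapaxes = sum(1 for count in word_counts.values() if count == 1)
    return hapaxes / len(tokens)


if __name__ == "__main__":
    sample = tokenize("The cat and the hat")
    print(yules_k(sample), hapax_ratio(sample))
```

Both functions take a token list, so a different tokenizer (for example, a model's subword tokenizer) can be swapped in without touching the metric code; scores are only comparable across models when the same tokenizer and a similar sample length are used.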