Representing Rule-based Chatbots with Transformers

Dan Friedman, Abhishek Panigrahi, Danqi Chen

TL;DR

This work investigates how Transformer-based chatbots could implement rule-based dialogue by modeling ELIZA as a formal target. It presents a decoder-only Transformer construction that realizes ELIZA via a modular pipeline: template matching with finite-state automata, two copying strategies (induction-head and position-based), cycling through reassembly rules, and memory-queue mechanisms, along with alternative substructures (gridworld vs. intermediate outputs). Through synthetic data and controlled experiments, the authors show that Transformers can learn to replicate ELIZA behavior, with induction-head copying and intermediate-output scratchpad usage emerging as prevalent mechanisms, and that data distribution shapes which mechanisms are preferred. The results connect neural chatbots to interpretable, symbolic dynamics, offering a framework for mechanistic analysis, and propose ELIZA as a benchmark to study learning dynamics and interpretability in conversational models.

Abstract

What kind of internal mechanisms might Transformers use to conduct fluid, natural-sounding conversations? Prior work has illustrated by construction how Transformers can solve various synthetic tasks, such as sorting a list or recognizing formal languages, but it remains unclear how to extend this approach to a conversational setting. In this work, we propose using ELIZA, a classic rule-based chatbot, as a setting for formal, mechanistic analysis of Transformer-based chatbots. ELIZA allows us to formally model key aspects of conversation, including local pattern matching and long-term dialogue state tracking. We first present a theoretical construction of a Transformer that implements the ELIZA chatbot. Building on prior constructions, particularly those for simulating finite-state automata, we show how simpler mechanisms can be composed and extended to produce more sophisticated behavior. Next, we conduct a set of empirical analyses of Transformers trained on synthetically generated ELIZA conversations. Our analysis illustrates the kinds of mechanisms these models tend to prefer--for example, models favor an induction head mechanism over a more precise, position-based copying mechanism; and using intermediate generations to simulate recurrent data structures, akin to an implicit scratchpad or Chain-of-Thought. Overall, by drawing an explicit connection between neural chatbots and interpretable, symbolic mechanisms, our results provide a new framework for the mechanistic analysis of conversational agents.
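
The contrast between the two copying mechanisms named above can be sketched symbolically. This is only an illustration under stated assumptions, not the paper's construction or a trained model: the function names `copy_by_position` and `copy_by_induction` are hypothetical, and resolving an ambiguous context by taking the first match is an arbitrary tie-break chosen for the example (a real attention head would spread weight across matches).

```python
def copy_by_position(source, start, length):
    """Position-based copying: the k-th output token is read from index start + k."""
    return [source[start + k] for k in range(length)]


def copy_by_induction(source, start, length, context=1):
    """Induction-head-style copying (symbolic caricature).

    After emitting the first token, repeatedly look up the last `context`
    emitted tokens in the source and copy whatever token follows that match.
    Assumption: when several positions match, the first one wins.
    """
    out = [source[start]]
    while len(out) < length:
        key = out[-context:]
        # Positions j where the tokens ending at j equal the key and a successor exists.
        matches = [
            j
            for j in range(len(key) - 1, len(source) - 1)
            if source[j - len(key) + 1 : j + 1] == key
        ]
        if not matches:
            break
        out.append(source[matches[0] + 1])
    return out


# A segment with no repeated tokens: both mechanisms agree.
plain = "u: you hate me".split()
print(copy_by_position(plain, 1, 3))   # ['you', 'hate', 'me']
print(copy_by_induction(plain, 1, 3))  # ['you', 'hate', 'me']

# A segment that repeats "i": the context lookup becomes ambiguous.
repeated = "u: i think i am sad".split()
print(copy_by_position(repeated, 1, 5))   # ['i', 'think', 'i', 'am', 'sad']
print(copy_by_induction(repeated, 1, 5))  # ['i', 'think', 'i', 'think', 'i']
```

The second case shows why a context-matching rule is less precise than computing the target position: once a token repeats inside the segment, a single token of context no longer identifies where to copy from.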

Paper Structure

This paper contains 68 sections, 2 equations, 16 figures, 4 tables.

Figures (16)

  • Figure 1: An example ELIZA conversation, adapted from Weizenbaum (1966) (left), and the corresponding parts of the ELIZA program (right). ELIZA uses both local pattern matching and two long-term memory mechanisms (cycling through responses, and a memory queue). At each turn, ELIZA compares the most recent input to a set of decomposition templates and applies one of the associated reassembly rules. The 0 symbols in the decomposition template are wildcards, which are used to decompose the user's input into segments. A response is generated by replacing each numeral in the reassembly rule with the corresponding segment of the user's input. If a template is matched more than once in a conversation, ELIZA cycles through a list of possible reassembly rules. If the input contains a special keyword ("my"), ELIZA stores it in a memory queue; later, if an input does not match any of the templates, ELIZA reads the first memory from the queue.
  • Figure 2: The input to the Transformer is the conversation history, consisting of user inputs (beginning with u:) followed by ELIZA's responses (e:). The constructions then have four parts. First, the input is divided into segments, each corresponding to a user input or ELIZA response. Second, the model attempts to match each user input to a decomposition template; this step is executed in parallel, with each input compared to every possible decomposition template. The model then identifies the highest scoring template and selects a reassembly rule, taking into account the number of times this template has been matched earlier in the conversation. Finally, the model generates an answer, either by applying a reassembly rule to the most recent user input (4a) or by transforming an input from earlier in the conversation, using the "memory queue" mechanism (4b).
  • Figure 3: Turn-level accuracy of Transformers trained on ELIZA conversations over training (Fig. \ref{fig:accuracy_by_turn_type_line}) and at the final checkpoint (Fig. \ref{fig:accuracy_by_turn_type_bar}), for models trained with three random seeds. Transformers quickly learn to identify the correct reassembly rule (measured by Prefix only accuracy), and take longer to learn to implement the transformation correctly (Full response). Accuracy is slightly worse on multi-turn and memory queue examples; see §\ref{app:learning}.
  • Figure 4: Which aspects of the task are most difficult for Transformers to learn? Copying (Fig. \ref{fig:accuracy_by_copy_length}): Accuracy decreases considerably with the number of tokens to copy, and decreases slightly with the number of distinct copying segments. Memory queue (Fig. \ref{fig:accuracy_memory_queue}): The dequeue accuracy decreases when there is a greater distance to the target memory and when there have been more queue operations earlier in the sequence. Null template (Fig. \ref{fig:accuracy_null_template}): The models do perfectly on null inputs provided there have been no memory turns in the sequence; accuracy decreases with the number of enqueues, indicating that the models struggle when the queue has been used but is now empty.
  • Figure 5: We train and test models on datasets that vary in whether copying segments are more or less likely to contain the same $n$-gram multiple times (Fig. \ref{fig:unigram_concentration_examples}). Models generalize poorly to data with more or less repetition compared to the training distribution (Fig. \ref{fig:unigram_concentration_accuracy}). Fig. \ref{fig:unigram_concentration_attention_difference} suggests that models trained on less repetitive data assign higher attention scores to tokens with matching contexts, rather than calculating the correct target position. See §\ref{sec:mechanisms}.
  • ...and 11 more figures
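
The ELIZA behavior described in the Figure 1 caption (wildcard decomposition, numeral-based reassembly, cycling through reassembly rules, and a memory queue) can be sketched as a small program. This is a minimal sketch under stated assumptions, not the paper's code or its Transformer construction: the class `Eliza`, the helpers `decompose` and `reassemble`, and the example rules are all hypothetical. Numerals are assumed to index the template's components (wildcards and literal words alike), templates are tried in list order rather than by score, and the fallback reply for an empty queue is invented for the example.

```python
from collections import deque


def decompose(template, tokens):
    """Split `tokens` into one segment per template component, or return None.

    "0" is a wildcard matching any (possibly empty) run of tokens;
    every other template token must match literally.
    """
    if not template:
        return [] if not tokens else None
    head, rest = template[0], template[1:]
    if head == "0":
        for i in range(len(tokens) + 1):  # try the shortest wildcard match first
            tail = decompose(rest, tokens[i:])
            if tail is not None:
                return [tokens[:i]] + tail
        return None
    if tokens and tokens[0] == head:
        tail = decompose(rest, tokens[1:])
        if tail is not None:
            return [[head]] + tail
    return None


def reassemble(rule, segments):
    """Replace each numeral n in the rule with the n-th segment (1-indexed)."""
    out = []
    for tok in rule:
        if tok.isdigit():
            out.extend(segments[int(tok) - 1])
        else:
            out.append(tok)
    return " ".join(out)


class Eliza:
    def __init__(self, rules, memory_rule, fallback):
        self.rules = rules              # [(template, [reassembly, ...]), ...]
        self.memory_rule = memory_rule  # (template, reassembly) used to enqueue
        self.fallback = fallback        # hypothetical reply when the queue is empty
        self.counts = [0] * len(rules)  # how often each template has matched
        self.queue = deque()            # memory queue (first in, first out)

    def respond(self, text):
        tokens = text.lower().split()
        response = None
        for i, (template, reassemblies) in enumerate(self.rules):
            segments = decompose(template, tokens)
            if segments is not None:
                # Cycle through the reassembly rules on repeated matches.
                rule = reassemblies[self.counts[i] % len(reassemblies)]
                self.counts[i] += 1
                response = reassemble(rule, segments)
                break
        if response is None:
            # No template matched: read the oldest memory, if any.
            response = self.queue.popleft() if self.queue else self.fallback
        # Store a memory after responding, so it is only read on a later turn.
        mem_template, mem_rule = self.memory_rule
        mem_segments = decompose(mem_template, tokens)
        if mem_segments is not None:
            self.queue.append(reassemble(mem_rule, mem_segments))
        return response


# Hypothetical example script, written in the style of the Figure 1 description.
RULES = [
    ("0 you 0 me".split(), ["what makes you think i 3 you".split(),
                            "why do you think i 3 you".split()]),
    ("0 my 0".split(), ["your 3".split()]),
]
MEMORY_RULE = ("0 my 0".split(), "earlier you said your 3".split())
FALLBACK = "please go on"

bot = Eliza(RULES, MEMORY_RULE, FALLBACK)
for turn in ["you hate me", "my boss hates me", "you ignore me",
             "hello there", "hello again"]:
    print("u:", turn)
    print("e:", bot.respond(turn))
```

In this run, the third turn triggers the second reassembly rule for the repeated template, the fourth turn matches no template and reads the stored memory, and the fifth turn falls back because the queue has been used but is now empty, the case Figure 4 reports as difficult for trained models.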