LOLAMEME: Logic, Language, Memory, Mechanistic Framework

Jay Desai, Xiaobo Guo, Srinivasan H. Sengamedu

TL;DR

LoLaMeMe addresses the challenge of mechanistic interpretability in large language models by introducing a controllable framework that models logic, memory, and latent structure in language. The approach defines two languages, LoLa and MeMe, and evaluates the transformer-based GPT-2 and convolution-based Hyena architectures, along with a hybrid, T Hex, that interleaves blocks from both. Across memorization, variable-length inputs, operator learning, multilingual data, and standard benchmarks such as ListOps, the T Hex architecture often outperforms the baselines, indicating that selective architectural mixing can enhance mechanistic capabilities. This work provides a principled, tunable testbed for analyzing language phenomena and informs the design of more interpretable and capable language systems.

Abstract

The performance of Large Language Models has achieved superhuman breadth with unprecedented depth. At the same time, language models are mostly black-box models, and the underlying mechanisms for their performance have been evaluated using synthetic or mechanistic schemes. We extend current mechanistic schemes to incorporate Logic, Memory, and nuances of Language such as latent structure. The proposed framework is called LOLAMEME, and we provide two instantiations of it: the LoLa and MeMe languages. We then consider two generative language model architectures: transformer-based GPT-2 and convolution-based Hyena. We propose the hybrid architecture T HEX and use the LOLAMEME framework to compare the three architectures. T HEX outperforms GPT-2 and Hyena on select tasks.

Paper Structure (27 sections, 1 equation, 4 figures, 5 tables)

This paper contains 27 sections, 1 equation, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Memorization performance comparison between Hyena and GPT-2. This dataset includes global variables.
  • Figure 2: Memorization performance comparison between Hyena and GPT-2. This dataset does not include global variables.
  • Figure 3: Exact match and loss for GPT-2 and Hyena with different variable lengths. The figure gives the mean and range of the variable length in the format mean [min, max]. Results are shown for the dataset with (left) and without (right) global variables.
  • Figure 4: Loss for T Hex, GPT-2, and Hyena with different variable lengths. The figure gives the mean and range of the variable length in the format mean [min, max]. Results are shown for the dataset with (left) and without (right) global variables.