Superposition in Transformers: A Novel Way of Building Mixture of Experts

Ayoub Ben Chaliah, Hela Dellagi

TL;DR

Superposition in Transformers mitigates catastrophic forgetting during fine-tuning of large language models by merging a base model $M_{\text{base}}$ and a fine-tuned model $M_{\text{fine}}$ into a single model, using layer-wise B-spline blending with coefficients $\alpha(l)$ and autoencoders that reconstruct hidden states. Both experts stay frozen; only the blending coefficients, the autoencoders, and related biases are trained, so the method preserves existing capabilities while adding compact domain-specific representations that can be selectively activated. The key contributions are (i) autoencoder-based reconstruction enabling in-model superposition, (ii) jointly learned B-spline blending, and (iii) parameter efficiency with minimal overhead, plus an optional 2D-$\alpha$ extension and dynamic switching. The approach shows promising results in reducing forgetting and in improving cross-domain perplexity and the alignment of internal representations, and it points toward future multilingual, symbolic-reasoning, and multi-domain integration with limited parameter growth.
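
To make the layer-wise B-spline blending concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the clamped cubic spline, the five control points, the 12-layer depth, the convex form $(1-\alpha)\,h_{\text{base}} + \alpha\,h_{\text{fine}}$, and all function names are assumptions made for illustration. In the method described above the control points would be learned; here they are fixed by hand.

```python
def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: i-th B-spline basis function of degree k at t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + k] - knots[i]
    right_den = knots[i + k + 1] - knots[i + 1]
    left = 0.0
    if left_den > 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if right_den > 0:
        right = (knots[i + k + 1] - t) / right_den * bspline_basis(i + 1, k - 1, t, knots)
    return left + right


def clamped_knots(n_ctrl, degree):
    """Uniform clamped knot vector on [0, 1] (assumed; needs n_ctrl > degree)."""
    n_spans = n_ctrl - degree
    interior = [j / n_spans for j in range(1, n_spans)]
    return [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)


def alpha_for_layer(layer, num_layers, ctrl, degree=3):
    """Blending coefficient alpha(l) as a B-spline over normalized layer depth."""
    knots = clamped_knots(len(ctrl), degree)
    # Keep t strictly below 1 so the half-open degree-0 intervals cover it.
    t = min(layer / (num_layers - 1), 1.0 - 1e-9)
    return sum(c * bspline_basis(i, degree, t, knots) for i, c in enumerate(ctrl))


def blend_hidden(h_base, h_fine, alpha):
    """Convex blend of two hidden-state vectors (assumed form of the merge)."""
    return [(1.0 - alpha) * b + alpha * f for b, f in zip(h_base, h_fine)]


# Hypothetical control points: early layers favor the base expert,
# late layers favor the fine-tuned expert.
ctrl = [0.0, 0.2, 0.5, 0.8, 1.0]
alphas = [alpha_for_layer(l, 12, ctrl) for l in range(12)]
blended = blend_hidden([1.0, 2.0], [3.0, 6.0], alphas[5])
```

Because B-spline basis functions are non-negative and sum to one, control points chosen in $[0, 1]$ keep every $\alpha(l)$ in $[0, 1]$ without an extra squashing function, and a handful of control points determines the coefficient for every layer.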

Abstract

Catastrophic forgetting remains a major challenge when adapting large language models (LLMs) to new tasks or domains. Conventional fine-tuning often overwrites existing knowledge, causing performance degradation on original tasks. We introduce Superposition in Transformers, a novel architecture that leverages autoencoders to superimpose the hidden representations of a base model and a fine-tuned model within a shared parameter space. By using B-spline-based blending coefficients and autoencoders that adaptively reconstruct hidden states based on the input data distribution, our method effectively mitigates catastrophic forgetting and enables a new paradigm of "in-model" superposition. This approach preserves original model capabilities while allowing compact domain-specific expertise to be added, and it supports dynamic switching between model states during inference.
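
As one hedged reading of the abstract, the blend-then-reconstruct step at layer $l$ could be written as follows. The symbols $h^{(l)}$, $c_k$, $B_k$, $E_l$, and $D_l$ are notation assumed here for illustration, as is the choice of which expert receives the weight $\alpha(l)$; the paper's own equations may differ.

$$
\begin{aligned}
\alpha(l) &= \sum_{k} c_k \, B_k(l) \\
h_{\text{blend}}^{(l)} &= \bigl(1 - \alpha(l)\bigr)\, h_{\text{base}}^{(l)} + \alpha(l)\, h_{\text{fine}}^{(l)} \\
\hat{h}^{(l)} &= D_l\!\bigl(E_l\bigl(h_{\text{blend}}^{(l)}\bigr)\bigr)
\end{aligned}
$$

Here $B_k$ are B-spline basis functions over layer depth, $c_k$ are the trainable spline coefficients, and $E_l$, $D_l$ are the encoder and decoder of the autoencoder that reconstructs the hidden state passed to the next layer.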

Paper Structure

This paper contains 45 sections, 9 equations, 8 figures, 2 tables, and 2 algorithms.

Figures (8)

  • Figure 1: Overview of a GPT-2 Merged Model Architecture.
  • Figure 2: Perplexity evolution across epochs for different merging methods.
  • Figure 3: t-SNE visualization of layer 4 hidden states from the merged model and expert models for English and French inputs (2D-alpha).
  • Figure 4: Comparison of average neuron diversity across layers for the base, fine-tuned, and merged models.
  • Figure 5: Comparison of mean neuron activation across layers for the base, fine-tuned, and merged models.
  • ...and 3 more figures