Transfer of Structural Knowledge from Synthetic Languages
Mikhail Budnikov, Ivan Yamshchikov
TL;DR
The paper analyzes transfer learning from synthetic languages to English, introducing the synthetic language flat_shuffle and the Tiny-Cloze Benchmark for evaluating simple NLU in small models. Using transfer-difficulty measures, embedding-spectrum analyses, and linear probes, it shows that flat_shuffle provides the strongest transfer and reveals how embedding structure evolves during fine-tuning. It also emphasizes data properties as a lever for data-efficient NLP and discusses the limitations of single-target evaluation and small-scale models.
Abstract
This work explores transfer learning from several synthetic languages to English. We investigate the structure of the embeddings in the fine-tuned models, the information they contain, and the capabilities of the fine-tuned models on simple linguistic tasks. We also introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. Finally, we introduce the Tiny-Cloze Benchmark, a new synthetic benchmark for natural language understanding that is more informative for less powerful models. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.
