BabyLM Turns 4 and Goes Multilingual: Call for Papers for the 2026 BabyLM Workshop

Leshem Choshen; Ryan Cotterell; Mustafa Omer Gul; Jaap Jumelet; Tal Linzen; Aaron Mueller; Suchir Salhan; Raj Sanjay Shah; Alex Warstadt; Ethan Gotlieb Wilcox

TL;DR

This call invites workshop papers and participation in the 4th BabyLM competition, and introduces a new track: Multilingual.

Abstract

The goal of BabyLM is to stimulate new research connections between cognitive modeling and language model pretraining. We invite contributions in this vein to the BabyLM Workshop, which will also include the 4th iteration of the BabyLM Challenge. As in previous years, the challenge features two "standard" tracks (Strict and Strict-Small), in which participants must train language models on under 100M or 10M words of data, respectively. This year, we move beyond our previous English-only pretraining datasets with a new Multilingual track, focusing on English, Dutch, and Chinese. For the workshop, we call for papers related to the overall theme of BabyLM, which includes training efficiency, small-scale training datasets, cognitive modeling, model evaluation, and architecture innovation.

Paper Structure (31 sections, 1 table)

Introduction: BabyLM
Key Dates
Non-competition Workshop Submissions
Topics
Workshop Theme
Paper submission
Review & Publication
Competition Details
Track Rules
New track: Multilingual
Continuing Tracks
Training Requirements
Training Duration Limitations
Intermediate Checkpoints
Motivation
...and 16 more sections
