Aligning Brain Activity with Advanced Transformer Models: Exploring the Role of Punctuation in Semantic Processing
Zenon Lamprou, Frank Pollick, Yashar Moshfeghi
TL;DR
The paper investigates how human brain activity aligns with state-of-the-art transformer models and whether punctuation affects semantic processing. Extending the Toneva & Wehbe framework, it evaluates RoBERTa, DistilBERT, ALBERT, and ELECTRA against fMRI data collected as participants read a chapter from Harry Potter, and it also tests four punctuation-removal scenarios. RoBERTa (and, to a lesser extent, DistilBERT) achieves the strongest brain alignment, whereas ALBERT and ELECTRA do not surpass BERT; notably, removing punctuation generally improves the brain-to-model mapping. These findings suggest that the brain relies less on punctuation for semantic interpretation than the models do, and they demonstrate the utility of brain-to-model alignment as a tool for probing and improving language representations in NLP systems.
Abstract
This research examines the alignment between neural activity and advanced transformer models, with emphasis on the semantic role of punctuation in text understanding. Using an approach originally proposed by Toneva and Wehbe, we evaluate four advanced transformer models (RoBERTa, DistilBERT, ALBERT, and ELECTRA) against neural activity data. Our findings indicate that RoBERTa aligns most closely with neural activity, surpassing BERT in accuracy. We also investigate the impact of punctuation removal on model performance and neural alignment, and find that BERT's accuracy improves in the absence of punctuation. This study contributes to the understanding of how neural networks represent language and of how punctuation influences semantic processing in the human brain.
