Unexpected but informative: What fixation-related potentials tell us about the processing of confusing program code

Annabelle Bergum, Anna-Maria Maurer, Norman Peitek, Regine Bader, Axel Mecklinger, Vera Demberg, Janet Siegmund, Sven Apel

TL;DR

This study investigates how programmers process confusing code constructs (atoms of confusion) by recording fixation-related potentials (FRPs) during naturalistic code reading. Using a data-driven FRP approach, the authors compare confusing and clean code and find a robust late frontal positivity at around $390$–$660$ ms after the first fixation on the critical region, along with slower comprehension and reduced accuracy for confusing snippets. The results suggest that the brain employs neurocognitive mechanisms akin to those in natural language processing to update the situation model in response to unexpected but plausible inputs, with implications for coding conventions, programming language design, and education. The work demonstrates the ecological validity of FRP methods in software engineering and paves the way for interdisciplinary collaboration between software engineering and psycholinguistics.

Abstract

As software pervades more and more areas of our professional and personal lives, there is an ever-increasing need to maintain software and for programmers to efficiently write and understand program code. In the first study of its kind, we analyze fixation-related potentials (FRPs) to explore the online processing of program code patterns that are confusing to programmers, but not to the computer (so-called atoms of confusion), and their underlying neurocognitive mechanisms in an ecologically valid setting. Relative to clean counterparts in program code without an atom of confusion, confusing code elicits a late frontal positivity of about 400 to 700 ms after first looking at the atom of confusion. This frontal positivity resembles an event-related potential (ERP) component found during natural language processing that is elicited by unexpected but plausible words in sentence context. Thus, we suggest that the brain engages similar neurocognitive mechanisms in response to unexpected and informative inputs in program code and in natural language. In both domains, these inputs update a comprehender's situation model, which is essential for information extraction from a quickly unfolding input. Our results have far-reaching implications for programming and pave the way for interdisciplinary collaborations between software engineering and psycholinguistics.

Paper Structure

This paper contains 19 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: A pair of corresponding code snippets with one containing an atom of confusion (left) and the other containing functionally equivalent code that is easy to understand (right).
  • Figure 2: Overview of the experiment setup and design (EEG Icon by Freepik). Part (a) displays a symbolic image of the experiment setup: a participant with keyboard and EEG cap in front of an eye tracker attached to the bottom of the screen. Part (b) visualizes the experiment design. The experiment included three blocks with $24$ trials each. Each trial consisted of: fixation cross ($5~\text{s}$ to calibrate); snippet presentation ($3$--$30~\text{s}$, terminated by participant; the duration of snippet presentation is measured as comprehension time); answer correctness (the participant has to submit the output, i.e., the final value in variable ); subjective difficulty rating (the participant has to indicate their perceived difficulty of performing the task); short break ($5$ or $20~\text{s}$) to reduce fatigue effects. After a block is finished, there is a long break of $10~\text{min}$ to reduce learning and fatigue effects.
  • Figure 3: Results of the FRP analysis. Part (a) shows the FRP elicited by confusing (orange) and clean (blue) program code at nine scalp electrodes. The zero time points denote the onset of the fixation and positive voltages are plotted upwards. Part (b) portrays the topographic distribution of the amplitude difference between confusing and clean code in consecutive $50~\text{ms}$ time intervals. The onset of the time interval is indicated above each map. As apparent from the figure, the frontal slow wave is most pronounced at frontal and left frontal recording sites. Part (c) shows the amplitude differences between confusing and clean program code for all electrodes under investigation and all time points between $100$ and $1,000~\text{ms}$ after fixation onset. The significant cluster is surrounded by black lines. Note that the late amplitude differences at frontal recordings did not reach the cluster-based significance threshold.
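
The windowed comparison described for Figure 3(b) can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function name `windowed_difference`, the $10~\text{ms}$ sampling step, and the toy amplitudes are all assumptions made for illustration, and the cluster-based significance test is not included. The sketch only shows how a confusing-minus-clean amplitude difference at one electrode could be averaged in consecutive $50~\text{ms}$ intervals between $100$ and $1{,}000~\text{ms}$ after fixation onset.

```python
from statistics import mean


def windowed_difference(confusing, clean, times_ms,
                        start_ms=100, stop_ms=1000, width_ms=50):
    """Mean amplitude difference (confusing minus clean) per time window.

    confusing, clean: equal-length lists of averaged amplitudes for one
        electrode (hypothetical input format, not the authors' data layout).
    times_ms: sample times in ms relative to fixation onset.
    Returns a list of (window_onset_ms, mean_difference) pairs.
    """
    if not (len(confusing) == len(clean) == len(times_ms)):
        raise ValueError("all inputs must have the same length")
    result = []
    for onset in range(start_ms, stop_ms, width_ms):
        # Collect the per-sample differences that fall inside this window.
        diffs = [c - k
                 for c, k, t in zip(confusing, clean, times_ms)
                 if onset <= t < onset + width_ms]
        if diffs:
            result.append((onset, mean(diffs)))
    return result


# Toy data (assumed): one sample every 10 ms, a flat clean waveform, and a
# confusing waveform that is 1.0 units more positive from 400 to 700 ms.
times = list(range(0, 1000, 10))
clean = [0.0] * len(times)
confusing = [1.0 if 400 <= t < 700 else 0.0 for t in times]

windows = windowed_difference(confusing, clean, times)
for onset, diff in windows:
    print(f"{onset:4d} ms: {diff:+.2f}")
```

With real recordings, the same computation would be repeated for every electrode to obtain one value per electrode and window, which is the kind of quantity a topographic map displays.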