Learning to Write Rationally: How Information Is Distributed in Non-Native Speakers' Essays

Zixin Tang, Janet G. van Hell

Abstract

People tend to distribute information evenly in language production for better and clearer communication. In this study, we compared essays written by second language learners with various native language (L1) backgrounds to investigate how they distribute information in their non-native language (L2) production. Analyses of surprisal and constancy of entropy rate indicated that writers with higher L2 proficiency can reduce the expected uncertainty of language production while still conveying informative content. However, the uniformity of information distribution showed less variability among different groups of L2 speakers, suggesting that this feature may be universal in L2 essay writing and less affected by L2 writers' variability in L1 background and L2 proficiency.
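As a rough illustration of the two kinds of quantities the abstract refers to, the sketch below computes per-token surprisal for a short text and then summarizes it by its mean (average information per token) and its variance (one common way to operationalize how uniformly information is spread). It is a toy under stated assumptions, not the authors' pipeline: the add-one-smoothed unigram model, the function names, and the example sentences are all hypothetical stand-ins, and the language model actually used in the study is not specified in this section.

```python
import math
from collections import Counter


def unigram_surprisals(tokens, reference_tokens):
    """Per-token surprisal in bits under an add-one-smoothed unigram model.

    Assumption: a unigram model stands in for whatever language model
    a real analysis would use; it ignores context entirely.
    """
    counts = Counter(reference_tokens)
    vocab = set(reference_tokens) | set(tokens)
    total = len(reference_tokens) + len(vocab)  # add-one smoothing denominator
    return [-math.log2((counts[t] + 1) / total) for t in tokens]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance; lower values mean a more even distribution."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Hypothetical toy data: a reference corpus and a short "essay".
reference = "the cat sat on the mat and the dog sat on the rug".split()
essay = "the dog sat on the mat".split()

surprisals = unigram_surprisals(essay, reference)
mean_surprisal = mean(surprisals)    # average information per token
uid_variance = variance(surprisals)  # spread of information across tokens
```

Under this sketch, a frequent word such as "the" receives lower surprisal than a rare word such as "dog", and a text whose tokens all carry similar surprisal yields a variance close to zero.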

Paper Structure

This paper contains 16 sections, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Entropy (left) and surprisal (right) values within written essays, categorized by speaker proficiency. The mean values of both metrics are represented by lines.
  • Figure 2: Boxplots of information metrics among non-native speakers' essays. Red lines indicate the mean and 95% distribution among native speakers.