Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations

Rena Gao, Carsten Roever, Jey Han Lau

TL;DR

The paper addresses the lack of ESL dialogue evaluation datasets by introducing SLEDE, together with a two-level annotation framework that combines four dialogue-level interactivity labels with 17 micro-level linguistic features. Predictive models built on these features (logistic regression, random forest, and naive Bayes; LR, RF, and NB) show that micro-level cues can strongly predict interactivity quality, often outperforming raw-text baselines such as BERT while relying little on deep pretraining. Key contributions are the annotated corpus, a transparent evaluation framework, and an analysis of which micro-level features have high impact overall and which matter only for specific labels, with implications for ESL assessment and feedback. The work advances fine-grained ESL dialogue evaluation and points toward more informative language proficiency assessment tied to interactive communication skills.

Abstract

We present an evaluation framework for interactive dialogue assessment in the context of English as a Second Language (ESL) speakers. Our framework collects dialogue-level interactivity labels (e.g., topic management; 4 labels in total) and micro-level span features (e.g., backchannels; 17 features in total). Given our annotated data, we study how the micro-level features influence the (higher-level) interactivity quality of ESL dialogues by constructing various machine learning-based models. Our results demonstrate that certain micro-level features, such as reference words (e.g., she, her, he), strongly correlate with interactivity quality, revealing new insights about the interaction between higher-level dialogue quality and lower-level linguistic signals. Our framework also provides a means to assess ESL communication, which is useful for language assessment.
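
To make the two-level idea concrete, here is a minimal sketch of how span-level annotations might be turned into per-dialogue feature counts and related to a dialogue-level score. It is not the authors' pipeline: the feature names, the numeric score scale, and the toy dialogues are all hypothetical, and a plain Pearson correlation stands in for the machine learning models the paper actually trains.

```python
from collections import Counter
from math import sqrt

# Hypothetical subset of the 17 micro-level features; names are illustrative only.
FEATURES = ["backchannel", "reference_word"]


def feature_counts(spans):
    """Count how often each micro-level feature is annotated in one dialogue."""
    counts = Counter(spans)
    return [counts[name] for name in FEATURES]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Made-up data: (annotated span features, assumed dialogue-level score).
toy_dialogues = [
    (["backchannel", "reference_word", "reference_word"], 4),
    (["backchannel"], 2),
    (["reference_word", "reference_word", "reference_word"], 5),
    ([], 1),
]

# One count vector per dialogue, plus the dialogue-level scores.
X = [feature_counts(spans) for spans, _ in toy_dialogues]
y = [score for _, score in toy_dialogues]

# Correlate each feature's counts with the dialogue-level score.
correlations = {
    name: pearson([row[i] for row in X], y) for i, name in enumerate(FEATURES)
}
```

On real annotations, the same count vectors could serve as input to a classifier in place of the correlation step; which representation and model the paper uses should be taken from the paper itself.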

Paper Structure (22 sections, 3 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: An example of an annotated dialogue with dialogue-level interactivity labels and micro-level features
  • Figure 2: Our proposed evaluation framework has dialogue-level interactivity labels and micro-level features targeting interaction and engagement.
  • Figure 3: Annotation tool Demo
  • Figure 4: Hierarchical Label Assignment Demo