ASC analyzer: A Python package for measuring argument structure construction usage in English texts

Hakyung Sung, Kristopher Kyle

TL;DR

The paper introduces ASC analyzer, an open-source Python package that automates ASC tagging via a RoBERTa-based tagger and computes 50 interpretable indices capturing ASC diversity, proportion, frequency, and ASC-verb association strength. Using a large sample of ESL writing from the ELLIPSE corpus, the authors demonstrate that ASC-based indices relate to L2 writing proficiency and offer incremental explanatory power beyond traditional syntactic complexity and existing lexicogrammatical measures. The approach combines the ASC tagger with norm-based frequency and association metrics drawn from EnCOW and SUBTLEX-US, enabling scalable, corpus-level analysis of construction use in L2 texts. The results indicate that ASC indices explain meaningful variance in writing scores, supporting their value as complementary indicators of constructional usage in language proficiency research. The authors also acknowledge limitations in tagger biases and in the scope of the reference norms.

Abstract

Argument structure constructions (ASCs) offer a theoretically grounded lens for analyzing second language (L2) proficiency, yet scalable and systematic tools for measuring their usage remain limited. This paper introduces the ASC analyzer, a publicly available Python package designed to address this gap. The analyzer automatically tags ASCs and computes 50 indices that capture diversity, proportion, frequency, and ASC-verb lemma association strength. To demonstrate its utility, we conduct both bivariate and multivariate analyses that examine the relationship between ASC-based indices and L2 writing scores.
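
To make the index families named above more concrete, the following is a minimal sketch of how proportion, diversity, and association strength could be computed from already-tagged data. It does not use the ASC analyzer package or reproduce its API. The tag labels, the toy (ASC, verb lemma) pairs, and all function names are hypothetical, and ΔP is used here only as one common association measure, which is an assumption about the kind of metric involved. The sketch also derives ΔP from the toy sample itself, whereas the paper describes association metrics based on reference-corpus norms.

```python
from collections import Counter

# Hypothetical (ASC tag, verb lemma) pairs, standing in for tagger output.
# The tag names are illustrative placeholders, not the package's label set.
tagged = [
    ("TRAN", "make"),
    ("TRAN", "take"),
    ("INTRAN", "go"),
    ("DITRAN", "give"),
    ("TRAN", "make"),
    ("INTRAN", "come"),
]


def asc_proportions(pairs):
    """Share of each ASC type among all ASC tokens."""
    counts = Counter(tag for tag, _ in pairs)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def asc_type_token_ratio(pairs):
    """A simple diversity index: distinct ASC types / ASC tokens."""
    tags = [tag for tag, _ in pairs]
    return len(set(tags)) / len(tags) if tags else 0.0


def delta_p_verb_given_asc(pairs, asc, verb):
    """Delta P = P(verb | asc) - P(verb | other ASCs), from a 2x2 table."""
    a = sum(1 for t, v in pairs if t == asc and v == verb)  # asc, verb
    b = sum(1 for t, v in pairs if t == asc and v != verb)  # asc, other verbs
    c = sum(1 for t, v in pairs if t != asc and v == verb)  # other ASCs, verb
    d = sum(1 for t, v in pairs if t != asc and v != verb)  # neither
    p_in = a / (a + b) if (a + b) else 0.0
    p_out = c / (c + d) if (c + d) else 0.0
    return p_in - p_out


proportions = asc_proportions(tagged)
ttr = asc_type_token_ratio(tagged)
dp = delta_p_verb_given_asc(tagged, "TRAN", "make")
print(proportions, ttr, round(dp, 3))
```

In this toy sample, "make" occurs only in the transitive construction, so its ΔP toward that construction is positive. A text-level index could then average such scores over all ASC-verb pairs in a text.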

Paper Structure

This paper contains 27 sections, 8 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: High-level architecture of ASC analyzer