Interpretable dimensions support an effect of agentivity and telicity on split intransitivity

Eva Neu, Brian Dillon, Katrin Erk

TL;DR

The paper re-examines the semantic correlates of split intransitivity by constructing interpretable agentivity and telicity axes in word embedding space from seed words and testing their predictive power against the syntactic judgments collected by Kim et al. (2024). It introduces a ranking-loss method for fitting the axes and compares seed-based and rating-based predictors in a mixed-effects Bayesian ordinal framework with leave-one-out validation. Seed-based dimensions predict unergativity/unaccusativity better than human ratings do, although telicity proves more problematic than agentivity. The findings suggest that the agentivity signal aligns with broader animacy features rather than purely intentional action, and that telicity ratings suffer from reliability issues. Overall, the work advances a semantics-to-syntax perspective by combining interpretable embeddings with human judgments to illuminate properties that are hard to rate directly, and it points to token-level analyses as a way to refine these insights in future work.

Abstract

Intransitive verbs fall into two different syntactic classes, unergatives and unaccusatives. It has long been argued that verbs describing an agentive action are more likely to appear in an unergative syntax, and those describing a telic event to appear in an unaccusative syntax. However, recent work by Kim et al. (2024) found that human ratings for agentivity and telicity were a poor predictor of the syntactic behavior of intransitives. Here we revisit this question using interpretable dimensions, computed from seed words on opposite poles of the agentive and telic scales. Our findings support the link between unergativity/unaccusativity and agentivity/telicity, and demonstrate that using interpretable dimensions in conjunction with human judgments can offer valuable evidence for semantic properties that are not easily evaluated in rating tasks.
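
As a rough illustration of what a dimension "computed from seed words on opposite poles" can look like, the sketch below builds an axis as the normalized difference between the mean vectors of two seed sets and projects another word onto it. This is a simplified difference-of-means construction shown under stated assumptions, not necessarily the authors' procedure (the paper also fits axes with a ranking loss). The three-dimensional vectors, the seed choices, and the helper names `mean_vector`, `seed_axis`, and `project` are all hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def seed_axis(positive_seeds, negative_seeds):
    """Unit vector pointing from the negative-pole mean to the positive-pole mean."""
    pos = mean_vector(positive_seeds)
    neg = mean_vector(negative_seeds)
    diff = [p - q for p, q in zip(pos, neg)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def project(vector, axis):
    """Scalar projection of a word vector onto the (unit-length) axis."""
    return sum(v * a for v, a in zip(vector, axis))

# Toy 3-d "embeddings", invented for illustration only (not real model output).
toy = {
    "run":    [0.9, 0.1, 0.2],
    "dance":  [0.8, 0.2, 0.1],
    "fall":   [0.1, 0.9, 0.3],
    "melt":   [0.2, 0.8, 0.2],
    "arrive": [0.3, 0.7, 0.4],
}

# Hypothetical seed sets: agentive pole vs. non-agentive pole.
agentivity_axis = seed_axis(
    [toy["run"], toy["dance"]],
    [toy["fall"], toy["melt"]],
)

score_run = project(toy["run"], agentivity_axis)
score_arrive = project(toy["arrive"], agentivity_axis)
```

With these toy values, a seed from the agentive pole receives a positive score and `arrive` a negative one. In a real setting the vectors would come from a trained embedding model, and the resulting scores would serve as predictors of syntactic behavior alongside human ratings.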

Paper Structure

This paper contains 10 sections, 2 equations, 9 tables.