A Systematic Comparison of Syntactic Representations of Dependency Parsing
Guillaume Wisniewski, Ophélie Lacroix
TL;DR
This paper examines how the choice of syntactic annotation scheme affects dependency parsing performance across languages. It proposes seven transformation rules that convert UD representations into alternative structures and evaluates them on $38$ languages, yielding $266$ transformed corpora ($44$ of which are identical to the original). Parsers are trained in a transition-based arc-eager setup with a dynamic oracle and evaluated with the Unlabeled Attachment Score (UAS). Results show that the UD scheme generally yields a higher UAS than the transformed representations (average difference of $0.66$, up to $8.1$), though some languages benefit from certain transformations; learnability metrics fail to reliably predict which representation will perform best. The findings underscore the practical impact of annotation choices on cross-language parsing and highlight the limitations of current predictability criteria for representation selection.
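Since every comparison above is reported in UAS, a minimal sketch of the metric may help. It assumes each sentence is given as a list of head indices (one per token, with 0 for the root) and that all tokens are counted, punctuation included; the function name and the toy data are hypothetical and not taken from the paper.

```python
from typing import Sequence


def unlabeled_attachment_score(
    gold_heads: Sequence[Sequence[int]],
    pred_heads: Sequence[Sequence[int]],
) -> float:
    """Return the percentage of tokens whose predicted head equals the gold head."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted corpora differ in number of sentences")

    total = 0
    correct = 0
    for gold_sent, pred_sent in zip(gold_heads, pred_heads):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        total += len(gold_sent)
        # Labels are ignored: only the head index of each token is compared.
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))

    if total == 0:
        raise ValueError("cannot compute UAS on an empty corpus")
    return 100.0 * correct / total


# Toy corpus (assumed data): two sentences, five tokens, one wrong head.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
uas = unlabeled_attachment_score(gold, pred)
print(f"UAS = {uas:.1f}")
```

Because a transformation changes the gold heads themselves, scores obtained on different representations are each measured against their own reference trees.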
Abstract
We compare the performance of a transition-based parser across different annotation schemes. We propose to convert some specific syntactic constructions observed in the Universal Dependencies treebanks into a so-called more standard representation and to evaluate parsing performance over all the languages of the project. We show that the ``standard'' constructions do not systematically lead to better parsing performance and that the scores vary considerably across languages.
