Syntactic constituency parsing is a fundamental problem in natural language
processing and has been the subject of intensive research and engineering for
decades. As a result, the most accurate parsers are domain-specific, complex,
and inefficient. In this paper we show that the domain-agnostic
attention-enhanced sequence-to-sequence model achieves state-of-the-art results
on the most widely used syntactic constituency parsing dataset when trained on
a large synthetic corpus that was annotated using existing parsers. It also
matches the performance of standard parsers when trained only on a small
human-annotated dataset, which shows that this model is highly data-efficient,
in contrast to sequence-to-sequence models without the attention mechanism. Our
parser is also fast, processing over a hundred sentences per second with an
unoptimized CPU implementation.