The Emergence of Chunking Structures with Hierarchical RNN
Zijun Wu, Anup Anand Deshmukh, Yongkang Wu, Jimmy Lin, Lili Mou
TL;DR
The paper tackles unsupervised chunking by introducing a Hierarchical RNN (HRNN) that explicitly models word-to-chunk and chunk-to-sentence composition via a trainable gating mechanism. Training proceeds in two stages: the HRNN is first pretrained on chunk labels induced by an unsupervised Compound PCFG parser, then finetuned on downstream text-generation tasks to refine its chunk representations. Empirical results show significant gains over baselines in unsupervised chunking and improved transfer to downstream tasks, with summarization-driven finetuning delivering the largest gains. A key finding is that linguistic structure emerges only transiently during finetuning, suggesting that chunk-like representations serve as a useful inductive bias early in training but may be discarded as the model optimizes downstream performance. The work advances unsupervised syntactic structure discovery and opens avenues for multilingual extension and deeper linguistic analysis.
Abstract
In Natural Language Processing (NLP), predicting linguistic structures, as in parsing and chunking, has mostly relied on manual annotations of syntactic structures. This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner. We present a Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions. Our approach involves a two-stage training process: pretraining with an unsupervised parser and finetuning on downstream NLP tasks. Experiments on multiple datasets reveal a notable improvement in unsupervised chunking performance in both the pretraining and finetuning stages. Interestingly, we observe that the emergence of the chunking structure is transient during the neural model's downstream-task training. This study contributes to the advancement of unsupervised syntactic structure discovery and opens avenues for further research in linguistic theory.
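
To make the word-to-chunk and chunk-to-sentence composition more concrete, the following is a minimal sketch of one soft-gated step of a two-level RNN. It is an illustration under stated assumptions, not the authors' formulation: the function names (`rnn_cell`, `predict_gate`, `hrnn_step`, `init_params`), the plain tanh cell, the way the gate is computed, and the convention that a gate value near 1 means "the previous word closed a chunk" are all hypothetical choices made here for readability.

```python
import numpy as np

def rnn_cell(x, h, W, U, b):
    """Plain tanh RNN cell (an assumed stand-in for whatever cell a real model uses)."""
    return np.tanh(W @ x + U @ h + b)

def predict_gate(h_word, h_chunk, p):
    """Hypothetical trainable gate: a sigmoid over both hidden states."""
    z = p["wg"] @ np.concatenate([h_word, h_chunk]) + p["bg"]
    return 1.0 / (1.0 + np.exp(-z))

def hrnn_step(x, h_word, h_chunk, m, p):
    """One soft-gated step; m in [0, 1], where m near 1 is assumed to mark a chunk boundary."""
    # Chunk level: absorb the finished chunk's word state when m -> 1, otherwise hold.
    chunk_candidate = rnn_cell(h_word, h_chunk, p["Wc"], p["Uc"], p["bc"])
    new_chunk = m * chunk_candidate + (1.0 - m) * h_chunk
    # Word level: restart from the chunk state when m -> 1, otherwise continue the chunk.
    prev = m * new_chunk + (1.0 - m) * h_word
    new_word = rnn_cell(x, prev, p["Ww"], p["Uw"], p["bw"])
    return new_word, new_chunk

def init_params(d_in, d_h, seed=0):
    """Small random parameters, for illustration only."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "Ww": rng.normal(0, s, (d_h, d_in)), "Uw": rng.normal(0, s, (d_h, d_h)), "bw": np.zeros(d_h),
        "Wc": rng.normal(0, s, (d_h, d_h)), "Uc": rng.normal(0, s, (d_h, d_h)), "bc": np.zeros(d_h),
        "wg": rng.normal(0, s, 2 * d_h), "bg": 0.0,
    }

# Usage: run the sketch over a toy sequence of five 3-dimensional word vectors.
params = init_params(d_in=3, d_h=4)
h_word, h_chunk = np.zeros(4), np.zeros(4)
for x in np.random.default_rng(1).normal(size=(5, 3)):
    m = predict_gate(h_word, h_chunk, params)
    h_word, h_chunk = hrnn_step(x, h_word, h_chunk, m, params)
```

Because the gate is a continuous value, the whole step stays differentiable, which is what would allow such a gate to be trained first against induced chunk labels and then through a downstream loss. Thresholding the gate (for example at 0.5, again an assumption) would yield a flat, non-hierarchical segmentation of the sentence into chunks.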
