End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF: A Reproducibility Study
Anirudh Ganesh, Jayavardhan Reddy
TL;DR
This reproducibility study validates Ma and Hovy's end-to-end BiLSTM-CNN-CRF architecture for sequence labeling, confirming near-original performance on NER and POS tasks and providing a complete PyTorch implementation. By dissecting each component (character-level CNNs, word-level BiLSTMs, and a CRF layer), the work shows how morphology, context, and structured prediction jointly drive strong results. It also emphasizes how sensitive the outcomes are to implementation details and offers reproducibility resources, including hyperparameter analyses and a public codebase. The findings support the model's robustness as a baseline for sequence labeling and point toward integrating contemporary contextualized embeddings in future work.
Abstract
We present a reproducibility study of the neural architecture for sequence labeling proposed by Ma and Hovy (2016) \cite{ma2016end}, which was state of the art at publication. The original BiLSTM-CNN-CRF model combines character-level representations from Convolutional Neural Networks (CNNs), word-level context modeling with Bi-directional Long Short-Term Memory networks (BiLSTMs), and structured prediction with Conditional Random Fields (CRFs). This end-to-end approach eliminates the need for hand-crafted features while achieving strong performance on named entity recognition (NER) and part-of-speech (POS) tagging. Our implementation reproduces the key results, achieving a 91.18\% F1 score on CoNLL-2003 NER and demonstrating the model's effectiveness across sequence labeling tasks. We provide a detailed analysis of the architecture's components and release an open-source PyTorch implementation to facilitate further research.
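To make the structured-prediction step concrete, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF, written in plain Python rather than taken from the released PyTorch code. It assumes per-token emission scores (in the full model these would come from the BiLSTM) and a tag-to-tag transition matrix. The function name `viterbi_decode`, the three-tag set, and all score values are hypothetical and chosen only for illustration. Start and end transition scores are omitted for brevity.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (assumed BiLSTM output).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] is the best score of any path that ends in tag j so far.
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the path score into tag j.
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Illustrative (made-up) scores for three tokens and tags O, B-PER, I-PER.
TAGS = ["O", "B-PER", "I-PER"]
emissions = [
    [2.0, 0.5, 0.1],
    [0.2, 1.0, 1.2],  # taken alone, this token prefers I-PER
    [0.1, 0.3, 1.5],
]
transitions = [
    [0.5, 0.0, -10.0],  # from O: moving to I-PER is heavily penalized
    [0.0, -1.0, 1.0],   # from B-PER
    [0.0, -1.0, 0.5],   # from I-PER
]
decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)  # ['O', 'B-PER', 'I-PER']
```

In this toy example, picking the best tag per token independently would yield O, I-PER, I-PER, which is not a valid BIO sequence. The transition scores steer the joint decoding to O, B-PER, I-PER instead, which is the kind of label dependency a CRF layer is meant to capture on top of the BiLSTM outputs.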
