FAQ-based Question Answering via Word Alignment
Zhiguo Wang, Abraham Ittycheriah
TL;DR
The paper addresses FAQ-based QA by modeling question similarity with a neural network whose features are extracted from the word alignment between two questions. It introduces a bootstrap-based method for extracting a small set of effective sparse lexical features, and a learning-to-rank algorithm that trains parameters better suited to ranking. In experiments on English, Spanish and Japanese FAQs, the similarity model outperforms baseline systems, the sparse features add 5% to top-1 accuracy, and learning-to-rank training works significantly better than the traditional method. On the TREC answer sentence selection task, the method outperforms all previous systems, which suggests practical value for multilingual FAQ systems and ranking-based QA tasks.
Abstract
In this paper, we propose a novel word-alignment-based method to solve the FAQ-based question answering task. First, we employ a neural network model to calculate question similarity, where the word alignment between two questions is used for extracting features. Second, we design a bootstrap-based feature extraction method to extract a small set of effective lexical features. Third, we propose a learning-to-rank algorithm to train parameters more suitable for ranking tasks. Experiments on three languages (English, Spanish and Japanese) demonstrate that the question similarity model is more effective than baseline systems, that the sparse features bring a 5% improvement in top-1 accuracy, and that the learning-to-rank algorithm works significantly better than the traditional method. We further evaluate our method on the answer sentence selection task, where it outperforms all previous systems on the standard TREC data set.
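To make the idea of alignment-based question similarity concrete, the following is a minimal sketch and not the model described in the paper. It assumes a greedy alignment in which each query word is linked to its most similar candidate word by cosine similarity, and it uses the mean link score as a single dense feature in place of the neural network. The names `TOY_VECTORS`, `align`, `similarity_feature` and `rank` are hypothetical, and the word vectors are invented for illustration.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for this illustration only.
TOY_VECTORS = {
    "reset": (1.0, 0.0, 0.0),
    "change": (0.9, 0.1, 0.0),
    "password": (0.0, 1.0, 0.0),
    "passcode": (0.1, 0.9, 0.0),
    "shipping": (0.0, 0.0, 1.0),
    "cost": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(query, candidate, vectors):
    """Greedy alignment (an assumption): link each query word to its
    most similar candidate word and record the similarity score."""
    links = []
    for q in query:
        best = max(candidate, key=lambda c: cosine(vectors[q], vectors[c]))
        links.append((q, best, cosine(vectors[q], vectors[best])))
    return links


def similarity_feature(query, candidate, vectors):
    """One dense feature: the mean similarity over all alignment links."""
    links = align(query, candidate, vectors)
    return sum(score for _, _, score in links) / len(links)


def rank(query, faq_questions, vectors):
    """Order FAQ questions by decreasing alignment-based similarity."""
    return sorted(
        faq_questions,
        key=lambda c: similarity_feature(query, c, vectors),
        reverse=True,
    )


query = ["reset", "password"]
faq_questions = [["shipping", "cost"], ["change", "passcode"]]
ranked = rank(query, faq_questions, TOY_VECTORS)
print(ranked[0])
```

In this toy setting the paraphrase "change passcode" ranks above the unrelated "shipping cost" because both of its words align closely with the query. A fuller system would feed many such alignment-derived features into a trained model rather than using a single average.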
