Reliable Academic Conference Question Answering: A Study Based on Large Language Model
Zhiwei Huang, Juan Li, Long Jin, Junjie Wang, Mingchen Tu, Yin Hua, Zhiqiang Liu, Jiawei Meng, Wen Zhang
TL;DR
This work addresses the challenge of obtaining timely, accurate information about academic conferences by introducing ConferenceQA, a benchmark covering seven conferences whose data is organized as trees, and STAR, a structure-aware retrieval method that exploits this hierarchy during retrieval. ConferenceQA is built by semi-automatically converting conference data into a hierarchical form, generating role-based QA pairs, and classifying the pairs into four difficulty types, followed by validation from multiple assessors. STAR generates descriptive, structure-informed representations of tree paths and retrieves over these descriptions, outperforming traditional path retrieval across diverse LLMs and retrievers. Together, the dataset and method advance knowledge-grounded conference QA and give researchers and practitioners better access to up-to-date information.
Abstract
Academic conferences foster global scholarly communication, and researchers consistently need accurate, up-to-date information about them. Because this information is scattered, an intelligent question-answering system is needed to handle researchers' queries efficiently and keep them aware of the latest developments. Recently, Large Language Models (LLMs) have demonstrated impressive question-answering capabilities and have been augmented with retrieval of external knowledge to compensate for outdated knowledge. However, these methods fall short here because they lack access to the latest conference knowledge. To address this challenge, we develop the ConferenceQA dataset, which covers seven diverse academic conferences. Specifically, for each conference, we first organize the conference data in a tree-structured format using a semi-automated method. We then annotate question-answer pairs and classify them into four types to better distinguish their difficulty. Building on this dataset, we further propose STAR (STructure-Aware Retrieval), a novel method that improves the question-answering abilities of LLMs by leveraging the inherent structural information during retrieval. Experimental results on the ConferenceQA dataset show the effectiveness of our retrieval method. The dataset and code are available at https://github.com/zjukg/ConferenceQA.
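
To make the idea of retrieving over tree-structured conference data concrete, the following is a minimal sketch, not the authors' implementation. It flattens a nested record into root-to-leaf paths, turns each path into a short text description, and ranks the descriptions against a question. Everything here is an assumption made for illustration: the toy record `conference`, the helper names `flatten_paths`, `describe`, and `retrieve`, the fixed description template (standing in for the generated, structure-informed descriptions STAR uses), and the token-overlap score (standing in for a real retriever).

```python
import re
from collections import Counter


def flatten_paths(tree, prefix=()):
    """Yield (path, value) pairs for every leaf of a nested dict/list."""
    if isinstance(tree, dict):
        for key, child in tree.items():
            yield from flatten_paths(child, prefix + (str(key),))
    elif isinstance(tree, list):
        for index, child in enumerate(tree):
            yield from flatten_paths(child, prefix + (str(index),))
    else:
        yield prefix, str(tree)


def describe(path, value):
    # Hypothetical template; a stand-in for a generated path description.
    return f"{' > '.join(path)}: {value}"


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, tree, top_k=1):
    """Rank path descriptions by token overlap with the question."""
    question_tokens = Counter(tokenize(question))
    scored = []
    for path, value in flatten_paths(tree):
        text = describe(path, value)
        # Multiset intersection counts the tokens shared with the question.
        overlap = sum((question_tokens & Counter(tokenize(text))).values())
        scored.append((overlap, text))
    # The sort is stable, so ties keep their original tree order.
    scored.sort(key=lambda item: -item[0])
    return [text for _, text in scored[:top_k]]


# Toy, made-up conference record used only for illustration.
conference = {
    "ExampleConf 2023": {
        "important dates": {
            "paper submission deadline": "2023-05-01",
            "notification": "2023-07-15",
        },
        "venue": {"city": "Exampleville"},
    }
}

if __name__ == "__main__":
    print(retrieve("When is the paper submission deadline?", conference))
```

Because each description carries the full path from the root, a question that mentions only "submission deadline" can still be matched to the right leaf, and the retrieved text tells the LLM where that leaf sits in the tree. In a real system the overlap score would be replaced by an actual retriever.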
