Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
Yunshi Lan, Xinyuan Li, Hanyue Du, Xuesong Lu, Ming Gao, Weining Qian, Aoying Zhou
TL;DR
The paper addresses how NLP can transform education by proposing a taxonomy of four core tasks, namely question answering (QA), question construction (QC), automated assessment (AA), and error correction (EC), along with eight subtasks, and by surveying relevant datasets, methods, and demonstration systems. It emphasizes the role of large language models and multimodal reasoning in Textbook QA and Math Word Problems, and details the generation, assessment, and correction tasks that support teaching and learning. The work highlights data sources, benchmarks, and practical demos, while pointing to future directions such as cross-subject and cross-language generalization, deployed LLM-based education systems, adaptive learning, interpretability, and ethics. Overall, it provides a comprehensive, task-oriented map of educational NLP that can guide researchers and practitioners in building effective and responsible AI-assisted education tools.
Abstract
Natural Language Processing (NLP) aims to analyze text or speech using computational techniques. It serves applications in domains such as healthcare, commerce, and education. In particular, NLP has been widely applied to the education domain, where its applications have enormous potential to help teaching and learning. In this survey, we review recent advances in NLP with a focus on solving problems relevant to the education domain. In detail, we begin by introducing the related background and the real-world scenarios in education to which NLP techniques could contribute. Then, we present a taxonomy of NLP in the education domain and highlight typical NLP applications including question answering, question construction, automated assessment, and error correction. Next, we illustrate the task definitions, challenges, and corresponding cutting-edge techniques based on the above taxonomy. In particular, LLM-involved methods are included for discussion due to the wide usage of LLMs in diverse NLP applications. After that, we showcase some off-the-shelf demonstrations in this domain, which are designed for educators or researchers. Finally, we conclude with five promising directions for future research: generalization over subjects and languages, deployed LLM-based systems for education, adaptive learning for teaching and learning, interpretability for education, and ethical considerations of NLP techniques. We organize all relevant datasets and papers in a publicly available GitHub repository for easier review: https://github.com/LiXinyuan1015/NLP-for-Education.
