Large Language Models for Education and Research: An Empirical and User Survey-based Analysis

Md Mostafizer Rahman; Ariful Islam Shiplu; Md Faizul Ibne Amin; Yutaka Watanobe; Lu Peng

TL;DR

To examine how large language models can support education and research, the paper jointly analyzes model architectures, benchmarks, and user perceptions. It combines background technology analysis with empirical experiments across math, NLP, science, medicine, and programming, plus a real-world user survey of students and educators. Key findings show that ChatGPT excels in fluency and general reasoning, while DeepSeek delivers superior coding performance and efficient knowledge retrieval; outputs in scientific and medical domains show strong validity. The work highlights the opportunities and risks of adoption and argues for responsible integration and governance in educational and research ecosystems.

Abstract

Pretrained Large Language Models (LLMs) have achieved remarkable success across diverse domains, with education and research emerging as particularly impactful areas. Among current state-of-the-art LLMs, ChatGPT and DeepSeek exhibit strong capabilities in mathematics, science, medicine, literature, and programming. In this study, we present a comprehensive evaluation of these two LLMs through background technology analysis, empirical experiments, and a real-world user survey. The evaluation explores trade-offs among model accuracy, computational efficiency, and user experience in educational and research settings. We benchmark the performance of these LLMs in text generation, programming, and specialized problem-solving. Experimental results show that ChatGPT excels in general language understanding and text generation, while DeepSeek demonstrates superior performance in programming tasks due to its efficiency-focused design. Moreover, both models deliver medically accurate diagnostic outputs and effectively solve complex mathematical problems. Complementing these quantitative findings, a survey of students, educators, and researchers highlights the practical benefits and limitations of these models, offering deeper insights into their role in advancing education and research.
