AgriGPT: a Large Language Model Ecosystem for Agriculture

Bo Yang, Yu Zhang, Lanfei Feng, Yunkui Chen, Jianyu Zhang, Xiao Xu, Nueraili Aierken, Yurui Li, Yuxuan Chen, Guijun Yang, Yong He, Runhe Huang, Shijian Li

TL;DR

AgriGPT is a domain-specific LLM ecosystem for agriculture that addresses gaps in data availability, factual grounding, and evaluation. It has three main components: a scalable multi-agent data engine that builds the Agri-342K dataset; a LoRA-based pipeline for continual pretraining and supervised fine-tuning; and Tri-RAG, a retrieval-augmented framework for factual multi-hop reasoning. The AgriBench-13K benchmark enables standardized, multi-task evaluation, and multilingual adaptation demonstrates cross-lingual transfer. The open-source release of models, data, and benchmarks supports equitable deployment, particularly in underserved regions, and offers a generalizable framework for building domain-specific LLMs in real-world sectors.

Abstract

Despite the rapid progress of Large Language Models (LLMs), their application in agriculture remains limited due to the lack of domain-specific models, curated datasets, and robust evaluation frameworks. To address these challenges, we propose AgriGPT, a domain-specialized LLM ecosystem for agricultural usage. At its core, we design a multi-agent scalable data engine that systematically compiles credible data sources into Agri-342K, a high-quality, standardized question-answer (QA) dataset. Trained on this dataset, AgriGPT supports a broad range of agricultural stakeholders, from practitioners to policy-makers. To enhance factual grounding, we employ Tri-RAG, a three-channel Retrieval-Augmented Generation framework combining dense retrieval, sparse retrieval, and multi-hop knowledge graph reasoning, thereby improving the LLM's reasoning reliability. For comprehensive evaluation, we introduce AgriBench-13K, a benchmark suite comprising 13 tasks with varying types and complexities. Experiments demonstrate that AgriGPT significantly outperforms general-purpose LLMs on both domain adaptation and reasoning. Beyond the model itself, AgriGPT represents a modular and extensible LLM ecosystem for agriculture, comprising structured data construction, retrieval-enhanced generation, and domain-specific evaluation. This work provides a generalizable framework for developing scientific and industry-specialized LLMs. All models, datasets, and code will be released to empower agricultural communities, especially in underserved regions, and to promote open, impactful research.
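The abstract describes Tri-RAG as combining three retrieval channels (dense, sparse, and knowledge-graph reasoning) but does not say how their results are merged. As a minimal sketch of one possible merging step, the example below uses reciprocal rank fusion, a standard rank-aggregation technique that is swapped in here purely for illustration and is not confirmed as the authors' method. The function name, the document ids, and the constant `k=60` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked highly by several channels accumulate the most score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id so the
    # output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top results from each of the three channels (assumed data).
dense_hits = ["d3", "d1", "d7"]   # embedding-based retrieval
sparse_hits = ["d1", "d4", "d3"]  # keyword-based retrieval
graph_hits = ["d1", "d9"]         # knowledge-graph traversal

fused = reciprocal_rank_fusion([dense_hits, sparse_hits, graph_hits])
print(fused)
```

In this toy input, `d1` appears in all three lists and therefore ranks first in the fused output. A real system would likely pass the fused passages to the generator as context; that step is omitted here.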

Paper Structure

This paper contains 14 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The AgriGPT Ecosystem: (a) the AgriGPT Data Engine, showing the three pipelines used to construct the Agri-342K dataset; (b) the model training workflow (continual pretraining and supervised fine-tuning); (c) Tri-RAG inference and ablation, highlighting multi-path gains over single-path baselines; (d) the Agri-342K dataset, which covers a broad topic spectrum; (e) the AgriBench-13K benchmark design
  • Figure 2: Workflow: Multi-Agent Framework for Ensuring Instruction Data Quality
  • Figure 3: LLM-Based Evaluation of AgriGPT and Other Models across 13 Tasks of AgriBench-13K and Total Score
  • Figure 4: Performance comparison of AgriGPT and other models on AgriBench-13K