Trajectory Planning for UAV-Based Smart Farming Using Imitation-Based Triple Deep Q-Learning

Wencan Mao; Quanxi Zhou; Tomas Couso Coddou; Manabu Tsukada; Yunling Liu; Yusheng Ji

TL;DR

This work addresses UAV trajectory planning for smart farming under environmental uncertainty and partial observability by formulating the problem as a Markov decision process (MDP) and solving it with a novel imitation-based triple deep Q-network (ITDQN). ITDQN combines an elite imitation mechanism, which reduces exploration costs, with a mediator Q-network layered over a double deep Q-network (DDQN), which improves training speed, stability, and performance. In synthetic simulations and real-world demonstrations, ITDQN consistently outperforms DQN, DDQN, and heuristic baselines, achieving higher weed recognition and data collection rates within UAV energy constraints. The approach enables scalable multi-UAV coordination for weed management and sensor data collection, offering practical benefits for precision agriculture.

Abstract

Unmanned aerial vehicles (UAVs) have emerged as a promising auxiliary platform for smart agriculture, capable of simultaneously performing weed detection, recognition, and data collection from wireless sensors. However, trajectory planning for UAV-based smart agriculture is challenging due to the high uncertainty of the environment, partial observations, and limited battery capacity of UAVs. To address these issues, we formulate the trajectory planning problem as a Markov decision process (MDP) and leverage multi-agent reinforcement learning (MARL) to solve it. Furthermore, we propose a novel imitation-based triple deep Q-network (ITDQN) algorithm, which employs an elite imitation mechanism to reduce exploration costs and utilizes a mediator Q-network over a double deep Q-network (DDQN) to accelerate and stabilize training and improve performance. Experimental results in both simulated and real-world environments demonstrate the effectiveness of our solution. Moreover, our proposed ITDQN outperforms DDQN by 4.43% in weed recognition rate and 6.94% in data collection rate.
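
As background for the DDQN baseline named above, the following minimal Python sketch contrasts the standard DQN bootstrap target with the double DQN target, in which the online network selects the next action and the target network evaluates it. The function names, argument names, and toy Q-values are hypothetical and chosen for illustration only. The sketch does not attempt to represent the mediator Q-network or the elite imitation mechanism of ITDQN, whose details are not given in this abstract.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target for a single transition.

    next_q_online / next_q_target are lists of Q-values for the next state,
    one entry per action, from the online and target networks respectively.
    """
    if done:
        return reward
    # The online network picks the greedy next action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies the value of that action.
    return reward + gamma * next_q_target[best_action]


# Toy values (assumed, not from the paper): the two networks disagree
# about which next action is best.
q_online = [1.0, 3.0, 2.0]
q_target = [0.5, 1.0, 4.0]

y_dqn = dqn_target(1.0, q_target, gamma=0.5)              # 1.0 + 0.5 * 4.0
y_ddqn = ddqn_target(1.0, q_online, q_target, gamma=0.5)  # 1.0 + 0.5 * 1.0
print(y_dqn, y_ddqn)
```

With these toy values the DQN target is 3.0 and the DDQN target is 1.5, which illustrates how decoupling action selection from evaluation can temper the overestimation that a single maximizing network tends to produce.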
