AgRowStitch: A High-fidelity Image Stitching Pipeline for Ground-based Agricultural Images
Isaac Kazuo Uyehara, Heesup Yun, Earl Ranario, Mason Earles
TL;DR
The paper tackles the difficulty of stitching ground-based agricultural images taken close to crops, where drift and parallax hinder traditional mosaic methods. It introduces an open-source, row-focused pipeline that stitches images in small batches with constraints on camera motion, using SuperPoint for features and LightGlue for matching, followed by OpenCV-based refinement and straightening. Across three datasets collected along 72 m rows, the approach produces leaf-scale mosaics that georeference positions with a mean absolute error of roughly 20 cm, without GPS or specialized hardware. This gives agronomists and plant phenotyping researchers accessible, high-resolution row mosaics when precise positioning data are unavailable.
Abstract
Agricultural imaging often requires individual images to be stitched together into a final mosaic for analysis. However, agricultural images can be particularly challenging to stitch because feature matching across images is difficult due to repeated textures, plants are non-planar, and mosaics built from many images can accumulate errors that cause drift. Although these issues can be mitigated by using georeferenced images or taking images at high altitude, there is no general solution for images taken close to the crop. To address this, we created a user-friendly and open-source pipeline for stitching ground-based images of a linear row of crops that does not rely on additional data. First, we use SuperPoint and LightGlue to extract and match features within small batches of images. Then we stitch the images in each batch in series while imposing constraints on the camera movement. After straightening and rescaling each batch mosaic, all batch mosaics are stitched together in series and then straightened into a final mosaic. We tested the pipeline on images collected along 72 m long rows of crops using two different agricultural robots and a camera manually carried over the row. In all three cases, the pipeline produced high-quality mosaics that could be used to georeference real-world positions with a mean absolute error of 20 cm. This approach provides accessible leaf-scale stitching to users who need to coarsely georeference positions within a row but do not have access to accurate positional data or sophisticated imaging systems.
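The batch-then-chain structure described above can be illustrated with a minimal NumPy sketch. This is not the AgRowStitch implementation: the helper names (`split_into_batches`, `translation`, `chain_in_series`, `mean_absolute_error`) are hypothetical, the pairwise transforms are hard-coded rather than estimated from SuperPoint and LightGlue matches, and a pure translation is assumed as a stand-in for whatever camera-motion constraint the pipeline actually imposes.

```python
import numpy as np


def split_into_batches(n_images, batch_size):
    """Group consecutive image indices into batches of at most batch_size."""
    return [
        list(range(start, min(start + batch_size, n_images)))
        for start in range(0, n_images, batch_size)
    ]


def translation(dx, dy=0.0):
    """3x3 homogeneous translation (assumed stand-in for a constrained motion)."""
    T = np.eye(3)
    T[0, 2] = dx
    T[1, 2] = dy
    return T


def chain_in_series(pairwise):
    """Compose pairwise transforms (image k+1 -> image k) so that every image
    is expressed in the coordinate frame of the first image."""
    placed = [np.eye(3)]
    for T in pairwise:
        placed.append(placed[-1] @ T)
    return placed


def mean_absolute_error(estimated_m, measured_m):
    """Mean absolute difference between estimated and measured positions."""
    est = np.asarray(estimated_m, dtype=float)
    ref = np.asarray(measured_m, dtype=float)
    return float(np.mean(np.abs(est - ref)))


# Seven images split into batches of three.
batches = split_into_batches(n_images=7, batch_size=3)

# Illustrative pixel shifts between four consecutive images in one batch.
shifts = [translation(100.0), translation(120.0), translation(90.0)]
placed = chain_in_series(shifts)
offsets = [float(P[0, 2]) for P in placed]  # x-offset of each image in the mosaic

# Illustrative along-row positions in metres (made-up numbers, not paper data).
mae = mean_absolute_error([1.1, 2.0, 2.8], [1.0, 2.0, 3.0])
```

Because each placement is the product of all transforms before it, any error in one pairwise estimate propagates to every later image. This is the drift the abstract refers to, and it is why keeping each chain short and straightening each batch mosaic before joining them is a plausible way to limit it.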
