Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments

Rajitha de Silva, Jonathan Cox, Marija Popovic, Cesar Cadena, Cyrill Stachniss, Riccardo Polvara

TL;DR

This work addresses perceptual aliasing in outdoor vineyard perception by embedding semantic context into keypoint descriptors, a technique called keypoint semantic integration (KSI). A modular pipeline combines semantic-instance embeddings from panoptic masks with standard descriptors, enabling single-pass matching of heterogeneous descriptors that improves robustness for relative pose estimation and long-term visual localisation across seasonal changes. The SemanticBLT dataset supports training segmentation models for vineyard objects, and comprehensive ablations justify design choices such as addition-based semantic integration, selective normalisation, and heterogeneous matching. Findings show consistent improvements in semantically meaningful regions and practical runtime on both desktop and embedded hardware, indicating relevance to vineyard robotics and to other environments with repetitive structure.
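The addition-based integration mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `integrate_semantics`, the integer instance-mask layout (0 for background), and the dictionary of per-instance embeddings are all assumptions made for illustration, and the selective normalisation step is omitted because its details are not given here.

```python
import numpy as np

def integrate_semantics(descriptors, keypoints, instance_mask, instance_embeddings):
    """Add a semantic embedding to descriptors of keypoints that lie on an instance.

    descriptors:         (N, D) array of keypoint descriptors.
    keypoints:           (N, 2) array of (x, y) pixel coordinates.
    instance_mask:       (H, W) integer array; 0 = background, k > 0 = instance id
                         (assumed layout, not taken from the paper).
    instance_embeddings: dict mapping instance id -> (D,) embedding (hypothetical).
    """
    enriched = np.asarray(descriptors, dtype=np.float32).copy()
    height, width = instance_mask.shape

    # Look up the instance id under each keypoint (rounded, clipped to the image).
    xs = np.clip(np.round(keypoints[:, 0]).astype(int), 0, width - 1)
    ys = np.clip(np.round(keypoints[:, 1]).astype(int), 0, height - 1)
    ids = instance_mask[ys, xs]

    for i, inst in enumerate(ids):
        inst = int(inst)
        # Background descriptors (id 0 or unknown id) are left unaltered.
        if inst != 0 and inst in instance_embeddings:
            enriched[i] += instance_embeddings[inst]
    return enriched, ids
```

Because enriched and background descriptors keep the same dimensionality, both kinds can be passed to a single matcher in one pass, which is the property the heterogeneous matching relies on.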

Abstract

Robust robot navigation in outdoor environments requires accurate perception systems capable of handling visual challenges such as repetitive structures and changing appearances. Visual feature matching is crucial to vision-based pipelines but remains particularly challenging in natural outdoor settings due to perceptual aliasing. We address this issue in vineyards, where repetitive vine trunks and other natural elements generate ambiguous descriptors that hinder reliable feature matching. We hypothesise that semantic information tied to keypoint positions can alleviate perceptual aliasing by enhancing keypoint descriptor distinctiveness. To this end, we introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image, enabling more accurate differentiation even among visually similar local features. We validate this approach in two vineyard perception tasks: (i) relative pose estimation and (ii) visual localisation. Across all tested keypoint types and descriptors, our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.

Paper Structure

This paper contains 19 sections, 3 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: KSI enhances the keypoint descriptors with corresponding semantic instances to minimise aliasing during keypoint matching in vineyards. Keypoint matches on semantic masks are visualised here across March and June.
  • Figure 2: Overview of the proposed KSI framework: Input semantic instances $\mathcal{S_I}$ and the set of keypoints $\mathcal{K_I}$ are extracted from each image in the input image pair $\mathcal{I}$. The descriptors of keypoints on a semantic instance are combined with the corresponding semantic embedding to generate semantically enriched keypoint descriptors, while the background descriptors are left unaltered. The semantically enriched descriptors and the background descriptors are matched together in a shared matcher.
  • Figure 3: Visualization of the vineyard loop path from which the data for experiments described in Section \ref{sec:exp} was collected. The robot traverses row 1 in the forward direction, then returns along row 2, completing a closed-loop trajectory. Despite foliage occlusions, trunk and building classes remain visible across all months, rendering them ideal for enhancing keypoint descriptors. Buildings: Orange, Pipes: Purple, Poles: Red, Robot: Blue, Trunks: Cyan.
  • Figure 4: Determination of semantic instance correspondences. The ground truth rotation and translation are applied to the initial mask, producing a projected mask that is then compared with each instance mask in the subsequent frame. The instance with the highest IoU is selected as the corresponding expected mask.
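The IoU-based selection described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the projected mask is taken as already computed (the projection with the ground-truth rotation and translation is not shown), masks are boolean arrays of equal shape, and the helper names `mask_iou` and `best_matching_instance` are hypothetical.

```python
import numpy as np

def mask_iou(mask_a, mask_b):
    """Intersection over union of two boolean masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(a, b).sum() / union)

def best_matching_instance(projected_mask, candidate_masks):
    """Return (index, IoU) of the candidate mask with the highest IoU.

    projected_mask:  mask from the first frame, assumed already projected
                     into the subsequent frame.
    candidate_masks: list of instance masks from the subsequent frame.
    """
    if len(candidate_masks) == 0:
        return None, 0.0
    ious = [mask_iou(projected_mask, m) for m in candidate_masks]
    best = int(np.argmax(ious))
    return best, ious[best]
```

In practice a minimum IoU threshold would likely be needed to reject instances that have left the field of view; no such threshold is stated in the caption, so none is applied here.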