BAWSeg: A UAV Multispectral Benchmark for Barley Weed Segmentation

Haitian Wang, Xinyu Wang, Muhammad Ibrahim, Dustin Severtson, Ajmal Mian

TL;DR

This paper proposes VISA (Vegetation-Index and Spectral Attention), a two-stream segmentation network that decouples radiance cues from vegetation-index cues and fuses them at native resolution, and introduces BAWSeg, a four-year UAV multispectral dataset collected over commercial barley paddocks in Western Australia.

Abstract

Accurate weed mapping in cereal fields requires pixel-level segmentation from UAV imagery that remains reliable across fields, seasons, and illumination. Existing multispectral pipelines often depend on thresholded vegetation indices, which are brittle under radiometric drift and mixed crop–weed pixels, or on single-stream CNN and Transformer backbones that ingest stacked bands and indices, where radiance cues and normalized index cues interfere and reduce sensitivity to small weed clusters embedded in crop canopies. We propose VISA (Vegetation-Index and Spectral Attention), a two-stream segmentation network that decouples these cues and fuses them at native resolution. The radiance stream learns from calibrated five-band reflectance using residual spectral-spatial attention to preserve fine textures and row boundaries that are attenuated by ratio indices. The index stream operates on vegetation-index maps with windowed self-attention to model local structure efficiently, state-space layers to propagate field-scale context without quadratic attention cost, and Slot Attention to form stable region descriptors that improve discrimination of sparse weeds under canopy mixing. To support supervised training and deployment-oriented evaluation, we introduce BAWSeg, a four-year UAV multispectral dataset collected over commercial barley paddocks in Western Australia, providing radiometrically calibrated blue, green, red, red edge, and near-infrared orthomosaics, derived vegetation indices, and dense crop, weed, and other labels with leakage-free block splits. On BAWSeg, VISA achieves 75.6% mIoU and 63.5% weed IoU with 22.8M parameters, outperforming a multispectral SegFormer-B1 baseline by 1.2 points in mIoU and 1.9 points in weed IoU. Under cross-plot and cross-year protocols, VISA maintains 71.2% and 69.2% mIoU, respectively. The BAWSeg data, VISA code, and trained models will be released upon publication.
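
The abstract reports mIoU and weed IoU. As a reminder of how these metrics are conventionally defined, the following is a minimal sketch using the standard per-class intersection-over-union computed from a confusion matrix. The class indices (0 = crop, 1 = weed, 2 = other), the function names, and the toy labels are assumptions made for illustration; this is not the authors' evaluation code.

```python
# Sketch of per-class IoU and mIoU from flat label sequences.
# Assumed class indices (hypothetical): 0 = crop, 1 = weed, 2 = other.

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU_c = TP / (TP + FP + FN); NaN if the class never appears."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious):
    """Average IoU over classes that are present (NaN entries are skipped)."""
    valid = [v for v in ious if v == v]  # NaN is the only value != itself
    return sum(valid) / len(valid)


# Toy example: eight labelled pixels.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2]
ious = per_class_iou(confusion_matrix(y_true, y_pred, num_classes=3))
print("crop IoU:", ious[0], "weed IoU:", ious[1], "mIoU:", mean_iou(ious))
```

In this toy case a single crop/weed confusion in each direction drops the weed IoU to one third while the crop IoU stays at 0.6, which illustrates why the weed class, being sparse, is usually reported separately from the mean.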

Paper Structure (19 sections, 15 equations, 6 figures, 4 tables)

Figures (6)

  • Figure S1: Overview of UAV multispectral data collection and BAWSeg dataset construction. The pipeline includes flight acquisition with RTK and irradiance sensors, radiometric calibration, orthorectification, stitching, feature extraction, annotation, and resulting crop–weed–soil maps.
  • Figure S2: Planned flight paths for the Kondinin experimental fields. Left, E8 polygon and survey lines planned in GSPro. Right, E2 polygon with the same overlap and altitude parameters. The inset shows a representative interior tile used for quality control of row orientation and texture.
  • Figure S3: Example of keypoint inliers for cross-band and inter-frame alignment. Green circles denote SIFT keypoints and blue lines denote matches retained after RANSAC.
  • Figure S4: From left to right, RGB orthomosaic, a single-band reflectance map, and the hard label map. Green denotes crop, red denotes weed, and white denotes other.
  • Figure S5: Overview of the proposed pipeline. Bottom: SRAB radiance stream with residual attention and a U-Net decoder. Top: vegetation-index stream with windowed self-attention, Mamba blocks, Slot Attention with mean-slot broadcast, and a scale-aligned decoder. Right: fusion head that concatenates the two 64-channel feature maps at the native resolution to generate class logits.
  • ...and 1 more figure
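
The captions for Figures S1 and S5 refer to vegetation-index maps derived from the calibrated bands. The text here does not list which indices BAWSeg provides, so the sketch below uses two widely known normalized-difference indices, NDVI and NDRE, purely as assumed examples; the function names and sample reflectance values are hypothetical. It also illustrates a property relevant to the abstract's argument: a normalized difference is unchanged by a uniform gain on both bands but shifts under an additive offset, which is one way radiometric drift can move pixels across a fixed threshold.

```python
# Sketch: normalized-difference vegetation indices on reflectance values.
# NDVI and NDRE are assumed examples; the source does not list its indices.

def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 where both inputs sum to zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized Difference Red Edge index."""
    return normalized_difference(nir, red_edge)


def index_map(band_a, band_b):
    """Apply the normalized difference pixel-wise to two H x W nested lists."""
    return [
        [normalized_difference(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


# Hypothetical reflectance for one vegetated pixel.
nir, red = 0.5, 0.1

base = ndvi(nir, red)                    # about 0.667
gained = ndvi(0.5 * nir, 0.5 * red)      # same value: a uniform gain cancels
offset = ndvi(nir + 0.05, red + 0.05)    # about 0.571: an offset does not cancel

print(base, gained, offset)
```

Because a uniform gain cancels, an index map discards absolute brightness and part of the texture contrast, which is consistent with the abstract's motivation for keeping a separate radiance stream alongside the index stream.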