Universal Bovine Identification via Depth Data and Deep Metric Learning

Asheesh Sharma, Lucy Randewich, William Andrew, Sion Hannuna, Neill Campbell, Siobhan Mullan, Andrew W. Dowsey, Melvyn Smith, Mark Hansen, Tilo Burghardt

TL;DR

This work tackles scalable, non-contact identification of individual cattle using depth data from a dorsal view. It introduces a deep metric-learning framework with two backbones (ResNet-50 for depth maps and PointNet for point clouds) and a kNN classifier on the resulting embeddings, evaluated on the CowDepth2023 dataset of 21,490 synchronised colour-depth image pairs from 99 cows. The key contribution is showing that depth-based biometrics can match coat-pattern-based methods and support open-set enrollment without retraining, backed by explicit open-set evaluation and interpretability analyses (Grad-CAM/PCSM). This approach advances precision livestock farming by enabling robust, scalable identification across breeds and conditions with minimal on-farm maintenance.

Abstract

This paper proposes and evaluates, for the first time, a top-down (dorsal view), depth-only deep learning system for accurately identifying individual cattle, and provides the associated code, datasets, and training weights for immediate reproducibility. Increasing herd sizes skew the cow-to-human ratio on farms and make manual monitoring of individuals more challenging. Real-time cattle identification is therefore essential for farms and a crucial step towards precision livestock farming. Underpinned by our previous work, this paper introduces a deep metric learning method for cattle identification using depth data from an off-the-shelf 3D camera. The method relies on CNN and MLP backbones that learn well-generalised embedding spaces from body shape to differentiate individuals, requiring neither species-specific coat patterns nor close-up muzzle prints for operation. The network embeddings are clustered using a simple algorithm such as $k$-NN for highly accurate identification, thus eliminating the need to retrain the network when enrolling new individuals. We evaluate two backbone architectures: ResNet, as previously used to identify Holstein Friesians using RGB images, and PointNet, which is specialised to operate on 3D point clouds. We also present CowDepth2023, a new dataset containing 21,490 synchronised colour-depth image pairs of 99 cows, to evaluate the backbones. Both the ResNet and PointNet architectures, which consume depth maps and point clouds, respectively, led to high accuracy that is on par with the coat pattern-based backbone.
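
To illustrate why a $k$-NN step over embeddings avoids retraining at enrolment, here is a minimal sketch. It is not the authors' released code: the function `knn_identify`, the 2-D toy embeddings, the cow names, and the choice of Euclidean distance with majority voting are all assumptions made for illustration. In practice the embeddings would come from a trained backbone.

```python
import numpy as np
from collections import Counter


def knn_identify(gallery, labels, query, k=5):
    """Return the majority label among the k gallery embeddings nearest to query.

    Hypothetical helper: Euclidean distance and majority voting are assumptions.
    """
    gallery = np.asarray(gallery, dtype=float)
    query = np.asarray(query, dtype=float)
    dists = np.linalg.norm(gallery - query, axis=1)
    k = min(k, len(labels))
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D "embeddings" standing in for backbone outputs (illustrative only).
rng = np.random.default_rng(0)
centres = {"cow_A": np.array([0.0, 0.0]), "cow_B": np.array([5.0, 5.0])}
gallery, labels = [], []
for name, centre in centres.items():
    gallery.extend(centre + 0.1 * rng.standard_normal((10, 2)))
    labels.extend([name] * 10)

# Enrol a previously unseen individual by appending its embeddings to the
# gallery. The embedding network itself is left untouched.
new_cow = np.array([10.0, 0.0]) + 0.1 * rng.standard_normal((10, 2))
gallery.extend(new_cow)
labels.extend(["cow_C"] * 10)

pred_new = knn_identify(gallery, labels, np.array([9.9, 0.1]), k=5)
pred_old = knn_identify(gallery, labels, np.array([0.05, -0.05]), k=5)
```

Enrolment here is only an append to the gallery, which is the property the abstract relies on: the quality of identification then depends on how well the learned embedding space separates individuals it has never seen.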

Paper Structure

This paper contains 15 sections, 9 equations, 13 figures, and 3 tables.

Figures (13)

  • Figure 1: Traditional and emerging identification techniques: Depth is a universally applicable, contact-free biometric for cattle identification. a) Hot/cold branding, tattooing, and RFID ear/collar tags are traditional methods for identifying cattle. b) Recent research into techniques incorporating biometrics like face, iris, and coat patterns has been made possible by breakthroughs in deep neural networks. Our method employs 3D surface (depth) features for contact-free cattle identification.
  • Figure 2: Open-set identification with depth data via deep metric learning. a) Examples of top-down RGB+Depth images in our dataset; pairs correspond to the same animal, column-wise. The inset shows the hook-pin-thurl region of a cow. b) The open-set validation excludes individuals completely, while the closed-set excludes some portion of the dataset for all individuals. c) A deep learning model generates embeddings clustered using kNN for identification.
  • Figure 3: The camera's point of view and post-processing of depth images. a) The progression of one cow travelling toward the milking parlour (from left to right) is displayed as a sequence of images and the corresponding time-synchronised depth maps. b) The original depth map is segmented by first thresholding and background subtraction, followed by "blob" detection and cropping.
  • Figure 4: Analogy for interpreting depth. a) Pushing an object against the pin impression device creates depressions at the contact plane; depth maps represent a similar property. b) The depth camera measures the distance between itself and the object for each pixel. c) Point clouds are generated using the depth map and the camera's intrinsic properties that associate individual points to the measured depth.
  • Figure 5: Preparing the dataset for open-set analysis. First, we randomly split the dataset into train and test sets in a 7:3 ratio. a) Then we discard n neighbouring images from the train set for every test image. b) As the value of n is increased from 2 to 10, more images are discarded from the train set, resulting in some cows with no images left. These cows in the test set are considered 'unknowns', i.e. part of the open-set, enrolled with kNN, post-training.
  • ...and 8 more figures
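
The Figure 4c caption notes that point clouds are generated from the depth map and the camera's intrinsic properties. A minimal sketch of that step follows, assuming a standard pinhole camera model. The function name `depth_to_point_cloud`, the intrinsics `fx`, `fy`, `cx`, `cy`, the toy values, and the convention that a depth of zero marks an invalid pixel are all assumptions and are not taken from the paper.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map to an (N, 3) point cloud with a pinhole model.

    Hypothetical helper: fx, fy are focal lengths in pixels and (cx, cy) is
    the principal point. Pixels with depth <= 0 are treated as invalid
    (an assumption about the sensor's convention) and dropped.
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    # u indexes columns (image x), v indexes rows (image y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]


# Toy 4x4 depth map: a flat surface 2 units from the camera, one invalid pixel.
depth = np.full((4, 4), 2.0)
depth[0, 0] = 0.0
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
```

Each valid pixel contributes one 3D point, so the flat toy surface yields 15 points that all share the same z value. A real pipeline would use the intrinsics reported by the depth camera and would typically subsample the cloud to a fixed size before passing it to a point-cloud backbone.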