A Comparison of Deep Learning and Established Methods for Calf Behaviour Monitoring

Oshana Dissanayake, Lucile Riaboff, Sarah E. McPherson, Emer Kennedy, Pádraig Cunningham

TL;DR

This study tackles automated calf welfare monitoring by classifying calf behaviours from collar accelerometer data. It benchmarks ROCKET against 11 DL/transformer methods and ConvTran on the ActBeCalf dataset; ROCKET achieves a macro-recall of 0.77, exceeding that of the best DL models. The authors attribute ROCKET’s edge to its high-dimensional, diverse feature extraction within a simple classification framework, while DL approaches appear to struggle with generalisation on this dataset. They validate the modelling choices with calf-level data separation and cross-validation, and discuss implications for practical, scalable precision livestock farming. The results suggest that robust feature-based methods can outperform complex DL models in real-world animal behaviour tracking, and they point future work toward hybrid pipelines and toward larger, more balanced datasets on which DL models might close the gap.
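To make the two evaluation ideas mentioned above concrete, here is a minimal sketch of macro-recall and of a calf-level split. It is not the authors' code: the function names, the `test_fraction` parameter, the calf identifiers, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare behaviours count as much as common ones."""
    hits = defaultdict(int)    # correctly predicted windows per true class
    totals = defaultdict(int)  # number of windows per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def calf_level_split(calf_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no calf contributes windows to both sets."""
    calves = sorted(set(calf_ids))
    rng = random.Random(seed)
    rng.shuffle(calves)
    n_test = max(1, round(len(calves) * test_fraction))
    test_calves = set(calves[:n_test])
    train_idx = [i for i, c in enumerate(calf_ids) if c not in test_calves]
    test_idx = [i for i, c in enumerate(calf_ids) if c in test_calves]
    return train_idx, test_idx


# Toy labels (hypothetical, not data from the paper).
y_true = ["walking"] * 4 + ["drinking", "running"]
y_pred = ["walking"] * 5 + ["running"]
# Per-class recall: walking 1.0, drinking 0.0, running 1.0, so macro-recall is 2/3,
# even though plain accuracy is 5/6.
score = macro_recall(y_true, y_pred)

# Toy calf identifiers, two windows per calf (hypothetical).
calf_ids = ["c1", "c1", "c2", "c2", "c3", "c3", "c4", "c4", "c5", "c5"]
train_idx, test_idx = calf_level_split(calf_ids)
```

Splitting by calf rather than by window means the test score reflects behaviour of animals the model has never seen, which is the generalisation question the study asks.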

Abstract

In recent years, there has been considerable progress in research on human activity recognition using data from wearable sensors. This technology also has potential in the context of animal welfare in livestock science. In this paper, we report on research on animal activity recognition in support of welfare monitoring. The data comes from collar-mounted accelerometer sensors worn by Holstein and Jersey calves, the objective being to detect changes in behaviour indicating sickness or stress. A key requirement in detecting changes in behaviour is to be able to classify activities into classes, such as drinking, running or walking. In Machine Learning terms, this is a time-series classification task, and in recent years, the Rocket family of methods has emerged as the state-of-the-art in this area. We have over 27 hours of labelled time-series data from 30 calves for our analysis. Using this data as a baseline, we present Rocket's performance on a 6-class classification task. Then, we compare this against the performance of 11 Deep Learning (DL) methods that have been proposed as promising methods for time-series classification. Given the success of DL in related areas, it is reasonable to expect that these methods will perform well here too. Surprisingly, despite taking care to ensure that the DL methods are configured correctly, none of them match Rocket's performance. A possible explanation for the impressive success of Rocket is that it has the data encoding benefits of DL models in a much simpler classification framework.
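The closing remark about encoding benefits in a simpler framework can be illustrated with a heavily simplified sketch of a Rocket-style transform: many random, untrained convolutional kernels each summarise a series by two numbers, and only a linear classifier on top is trained. This is an illustration under stated assumptions, not the authors' pipeline. It handles a single channel only, samples dilation uniformly (a simplification of the original Rocket scheme), uses no padding, and the window length of 75 samples and the names `make_kernels` and `transform` are hypothetical.

```python
import math
import random


def make_kernels(n_kernels, series_length, seed=0):
    """Draw random (weights, bias, dilation) triples; nothing here is learned."""
    if series_length < 11:
        raise ValueError("series must be at least as long as the longest kernel (11)")
    rng = random.Random(seed)
    kernels = []
    for _ in range(n_kernels):
        length = rng.choice([7, 9, 11])
        weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
        mean = sum(weights) / length
        weights = [w - mean for w in weights]  # mean-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        # Largest dilation for which the dilated kernel still fits inside the series.
        max_dilation = (series_length - 1) // (length - 1)
        dilation = rng.randint(1, max_dilation)  # simplification: uniform sampling
        kernels.append((weights, bias, dilation))
    return kernels


def transform(series, kernels):
    """Map one univariate series to two features per kernel: PPV and max."""
    features = []
    for weights, bias, dilation in kernels:
        span = (len(weights) - 1) * dilation
        outputs = [
            bias + sum(w * series[start + j * dilation] for j, w in enumerate(weights))
            for start in range(len(series) - span)
        ]
        ppv = sum(o > 0 for o in outputs) / len(outputs)  # proportion of positive values
        features.extend([ppv, max(outputs)])
    return features


# Toy signal standing in for one accelerometer axis (hypothetical window of 75 samples).
series = [math.sin(0.3 * t) for t in range(75)]
kernels = make_kernels(n_kernels=100, series_length=len(series))
features = transform(series, kernels)  # 2 features per kernel
```

The resulting feature vector would then be passed to a simple linear classifier (the original Rocket proposal uses a ridge classifier), which is the only trained component. Series passed to `transform` are assumed to have the length given to `make_kernels`.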

Paper Structure (28 sections, 7 figures, 5 tables)

Figures (7)

  • Figure 1: A comparison of accuracies of our DL model implementations against some published baselines.
  • Figure 2: Overview of the methodology used to compare classification performance and to assess generalisation capability across calves.
  • Figure 3: Methodology followed to ensure data generalisation.
  • Figure 4: Macro-recall, macro-precision, and macro-F1 scores for each of the models.
  • Figure 5: Class-level precision for ROCKET, ConvTran, and the top-performing DL algorithms.
  • ...and 2 more figures