Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance

Raza Imam; Muhammad Huzaifa; Nabil Mansour; Shaher Bano Mirza; Fouad Lamghari

TL;DR

This work tackles real-time, domain-adaptable camel farm surveillance by coupling a Unified Auto-Annotation framework (GroundingDINO combined with SAM) with a Fine-Tune Distillation pipeline that transfers knowledge from a large teacher to a lightweight student (e.g., YOLOv8). The approach enables automatic labeling of surveillance frames and distills powerful generalization into an edge-deployable detector, validated on data from Al-Marmoom Camel Farm. Among tested configurations, YOLOv8s trained for 50 epochs at 1024-pixel resolution achieved the best balance of accuracy (AP ≈ 80.3%), speed, and resource use, making it suitable for real-time monitoring. The framework reduces labeling effort, offers transparency in training, and supports domain adaptation to other farms or livestock tasks, with open-source code available for reproduction and extension.

Abstract

In this study, we propose an automated framework for camel farm monitoring with two key contributions: the Unified Auto-Annotation framework and the Fine-Tune Distillation framework. The Unified Auto-Annotation approach combines two models, GroundingDINO (GD) and Segment-Anything-Model (SAM), to automatically annotate raw datasets extracted from surveillance videos. Building on this foundation, the Fine-Tune Distillation framework fine-tunes student models on the auto-annotated dataset. This process transfers knowledge from a large teacher model to a small student model and can be viewed as a variant of Knowledge Distillation. Because the knowledge is transferred through labels generated for the target data, the framework can be adapted to specific use cases and is suitable for domain-specific applications. Using our raw dataset collected from Al-Marmoom Camel Farm in Dubai, UAE, and a pre-trained teacher model, GroundingDINO, the Fine-Tune Distillation framework produces a lightweight deployable model, YOLOv8. This framework demonstrates high performance and computational efficiency, enabling real-time object detection. Our code is available at https://github.com/Razaimam45/Fine-Tune-Distillation
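To make the hand-off between the two frameworks concrete, the following is a minimal sketch of how teacher detections could be turned into YOLO-format training labels for the student. It is not taken from the authors' repository: the `Detection` record, the function names, the class order, the " . " prompt separator, and the 0.35 confidence threshold are all assumptions made for illustration. Only the four class names come from the paper (Figure 5).

```python
from dataclasses import dataclass
from typing import List

# Class names are from the paper's Figure 5 caption; the index order is an assumption.
CLASSES = ["camel", "rope", "mask", "pole"]


@dataclass
class Detection:
    """Hypothetical record for one teacher prediction (pixel coordinates)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float


def build_prompt(classes: List[str]) -> str:
    """Join class names into a single text prompt.

    The ' . ' separator is an assumed convention for open-vocabulary
    detectors; check the teacher model's documentation before relying on it.
    """
    return " . ".join(classes) + " ."


def to_yolo_line(det: Detection, img_w: int, img_h: int) -> str:
    """Convert a pixel-space box to 'class cx cy w h', normalized to [0, 1]."""
    cx = (det.x_min + det.x_max) / 2 / img_w
    cy = (det.y_min + det.y_max) / 2 / img_h
    w = (det.x_max - det.x_min) / img_w
    h = (det.y_max - det.y_min) / img_h
    return f"{CLASSES.index(det.label)} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def annotate(dets: List[Detection], img_w: int, img_h: int,
             min_score: float = 0.35) -> List[str]:
    """Keep confident detections of known classes and emit label lines.

    min_score is a hypothetical threshold, not a value reported in the paper.
    """
    return [
        to_yolo_line(d, img_w, img_h)
        for d in dets
        if d.score >= min_score and d.label in CLASSES
    ]


if __name__ == "__main__":
    frame_dets = [
        Detection("camel", 100, 200, 300, 600, 0.90),
        Detection("rope", 10, 10, 50, 50, 0.10),  # dropped: below threshold
    ]
    print(build_prompt(CLASSES))
    for line in annotate(frame_dets, img_w=1000, img_h=800):
        print(line)
```

Label files written this way, one per frame, are the kind of input a YOLO-style student would be fine-tuned on. Filtering by confidence before writing labels is one simple way to limit the teacher's noisy predictions reaching the student, though the paper's own filtering strategy may differ.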

Paper Structure (14 sections, 5 equations, 12 figures, 6 tables)

Introduction
Literature Review
Research Design and Preprocessing
Proposed Method
Unified Auto-Annotation Framework
Fine-Tune Distillation in Real-time monitoring
Experiments
Setup
Models and Metrics
Results
Performance Analysis
Computational Analysis
Discussion and Limitation
Conclusion and Future Work

Figures (12)

Figure 1: Overview of the Knowledge Transfer in Fine-Tune Distillation
Figure 2: An overview of the comprehensive research design implemented in this work
Figure 3: Sample images from our dataset after the preprocessing phase
Figure 4: Data distribution before and after the augmentation stage
Figure 5: Zero-shot inference on our dataset images using GroundingDINO with the class prompts "camel", "rope", "mask", and "pole", the four classes of interest in the taming process. Bounding boxes (BB) are red for "camel", green for "rope", yellow for "mask", and blue for "pole".
...and 7 more figures
