Open-Source Assessments of AI Capabilities: The Proliferation of AI Analysis Tools, Replicating Competitor Models, and the Zhousidun Dataset

Ritwik Gupta, Leah Walker, Eli Glickman, Raine Koizumi, Sarthak Bhatnagar, Andrew W. Reddie

TL;DR

This work presents an open-source methodology to assess military AI capabilities using the Zhousidun dataset, a publicly released Chinese-origin image collection of US and allied destroyers with annotated components. By replicating a near-state-of-the-art detector (YOLOv8l) on Zhousidun and evaluating on both real-like test data ($mAP$ at $IoU=0.50$) and synthetic scenes, the paper reveals strong in-distribution performance ($mAP=0.926$) but limited out-of-distribution effectiveness ($mAP≈0.45$, recall ≈0.26, precision ≈0.87). The findings highlight data quality and domain-shift limitations when training on web-scraped imagery and demonstrate how synthetic data can bootstrap more robust detectors, informing open-source net assessment. Overall, the work proposes a robust, repeatable framework for evaluating AI-enabled military capabilities using public data and open-source tools, with implications for strategic analysis and force-planning.
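To make the reported numbers concrete, the sketch below shows how precision and recall are typically computed at a single $IoU=0.50$ threshold. It is an illustrative assumption, not the paper's evaluation code: the names `iou` and `precision_recall` and the toy boxes are hypothetical, and full $mAP$ additionally averages precision over confidence thresholds and classes, which this sketch omits.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU is >= thr.
    """
    matched = set()
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_iou >= thr:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


# Toy example: four labeled components, one confident and correct detection.
truths = [(0, 0, 10, 10), (20, 0, 30, 10), (40, 0, 50, 10), (60, 0, 70, 10)]
preds = [(1, 1, 10, 10)]
print(precision_recall(preds, truths))
```

In the toy example the single detection is correct, so precision is high, while three of the four labeled components are missed, so recall is low. This mirrors the out-of-distribution pattern in the TL;DR (precision ≈0.87, recall ≈0.26): the detector is usually right when it fires, but it misses most targets.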

Abstract

The integration of artificial intelligence (AI) into military capabilities has become the norm for major military powers across the globe. Understanding how these AI models operate is essential for maintaining strategic advantages and ensuring security. This paper demonstrates an open-source methodology for analyzing military AI models through a detailed examination of the Zhousidun dataset, a Chinese-origin dataset that exhaustively labels critical components on American and allied destroyers. By demonstrating the replication of a state-of-the-art computer vision model on this dataset, we illustrate how open-source tools can be leveraged to assess and understand key military AI capabilities. This methodology offers a robust framework for evaluating the performance and potential of AI-enabled military capabilities, thus enhancing the accuracy and reliability of strategic assessments.

Paper Structure (14 sections, 8 figures)

Figures (8)

  • Figure 1: Images from the Zhousidun dataset, which feature various military vessels with overlaid bounding boxes on SPY radars and VLS missile-launching components.
  • Figure 2: Additional examples of images from the Zhousidun dataset.
  • Figure 3: (left) Distribution of types of sensors used in the analyzed papers. (right) Distribution of the countries of origin for the institutions represented in the analyzed papers. A full list of papers is in the paper's appendix.
  • Figure 4: A synthetic scene of the USS Arleigh Burke with positions of captured images.
  • Figure 5: Predictions on the Zhousidun test set.
  • ...and 3 more figures