On Extending the Automatic Test Markup Language (ATML) for Machine Learning

Tyler Cody, Bingtong Li, Peter A. Beling

TL;DR

The paper tackles the lack of standardized messaging for the operational test and evaluation (T&E) of edge ML by evaluating whether IEEE Std 1671 (ATML) can be extended to cover ML tests, including their datasets and software dependencies. It shows how cross-validation, adversarial robustness, and drift detection can be described as ATML TestDescriptions, and discusses where custom extensions or 1671.X may be required. It argues that only minor extensions may be needed to address ML-specific concepts, and contrasts ATML with model-centric standards such as PMML and ONNX, highlighting ATML's role in operational T&E and governance. It outlines future work to implement an automatic test system (ATS) using ATML with open-source ML software and hardware, and to clarify how ATML interacts with PMML and ONNX within open-architecture efforts for AI-enabled systems.

Abstract

This paper addresses the urgent need for messaging standards in the operational test and evaluation (T&E) of machine learning (ML) applications, particularly in edge ML applications embedded in systems like robots, satellites, and unmanned vehicles. It examines the suitability of the IEEE Standard 1671 (IEEE Std 1671), known as the Automatic Test Markup Language (ATML), an XML-based standard originally developed for electronic systems, for ML application testing. The paper explores extending IEEE Std 1671 to encompass the unique challenges of ML applications, including the use of datasets and dependencies on software. Through modeling various tests such as adversarial robustness and drift detection, this paper offers a framework adaptable to specific applications, suggesting that minor modifications to ATML might suffice to address the novelties of ML. This paper differentiates ATML's focus on testing from other ML standards like Predictive Model Markup Language (PMML) or Open Neural Network Exchange (ONNX), which concentrate on ML model specification. We conclude that ATML is a promising tool for effective, near real-time operational T&E of ML applications, an essential aspect of AI lifecycle management, safety, and governance.
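To make the idea of an XML test description for an ML test concrete, the following is a minimal sketch in Python. It is not the paper's method and does not use the IEEE Std 1671 schemas: every element name, attribute name, and the function `build_ml_test_description` are hypothetical placeholders, chosen only to show how a test, its dataset, and its software dependencies might travel together in one XML message.

```python
import xml.etree.ElementTree as ET


def build_ml_test_description(test_name, dataset_uri, dependencies, folds):
    """Build an illustrative XML description of an ML test.

    All element and attribute names are hypothetical placeholders,
    NOT the names defined by the IEEE Std 1671 (ATML) schemas.
    """
    root = ET.Element("TestDescription", {"name": test_name})

    # The dataset the test is run against (an ML-specific concept).
    ET.SubElement(root, "Dataset", {"uri": dataset_uri})

    # Software packages the test depends on, with pinned versions.
    deps = ET.SubElement(root, "SoftwareDependencies")
    for package, version in dependencies.items():
        ET.SubElement(deps, "Dependency", {"name": package, "version": version})

    # Test parameters; here only the number of cross-validation folds.
    params = ET.SubElement(root, "Parameters")
    ET.SubElement(params, "Parameter", {"name": "folds", "value": str(folds)})
    return root


# Example values are made up for illustration.
desc = build_ml_test_description(
    test_name="cross_validation",
    dataset_uri="file:///data/reference.csv",
    dependencies={"scikit-learn": "1.4"},
    folds=5,
)
xml_text = ET.tostring(desc, encoding="unicode")
print(xml_text)
```

A real ATML document would instead follow the element names, namespaces, and types in the published IEEE Std 1671 schemas; the sketch only conveys the general shape of such a message.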

Paper Structure

This paper contains 14 sections and 12 figures.

Figures (12)

  • Figure 1: Example test description for digital multimeter unit (DMU).
  • Figure 2: Test description for cross-validation.
  • Figure 3: Test description for cross-validation with reference dataset.
  • Figure 4: Test description for adversarial robustness test.
  • Figure 5: Test description for multiple adversarial tests.
  • ...and 7 more figures