Towards Secure MLOps: Surveying Attacks, Mitigation Strategies, and Research Challenges

Raj Patel, Himanshu Tripathi, Jasper Stone, Noorbakhsh Amiri Golilarz, Sudip Mittal, Shahram Rahimi, Vini Chaudhary

TL;DR

This paper addresses the security of MLOps by applying the MITRE ATLAS framework to map AI-centric attacks across the full ML lifecycle. It delivers a lifecycle-oriented taxonomy and threat model, grounded in real-world incidents and red-team exercises, and aligns a comprehensive set of mitigations with three phases: Design/Setup, Model Development and Evaluation, and Operations. The contributions include an ATLAS-based attack taxonomy, a corresponding protection framework (including AI BOM, supply-chain verifications, adversarial training, and access controls), and a discussion of research gaps and practical guidance for integrating security from inception. The work advances practical security for MLOps and LLMOps, offering structured guidance to practitioners and researchers on defending end-to-end AI-enabled systems amid evolving cyber threats.

Abstract

The rapid adoption of machine learning (ML) technologies has driven organizations across diverse sectors to seek efficient and reliable methods to accelerate the path from model development to deployment. Machine Learning Operations (MLOps) has emerged as an integrative approach addressing these requirements by unifying relevant roles and streamlining ML workflows. As the MLOps market continues to grow, securing these pipelines has become increasingly critical. However, the unified nature of the MLOps ecosystem introduces vulnerabilities, making it susceptible to adversarial attacks in which a single misconfiguration can lead to compromised credentials, severe financial losses, damaged public trust, and the poisoning of training data. Our paper presents a systematic application of the MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) framework, supplemented by reviews of white and grey literature, to assess attacks across different phases of the MLOps ecosystem. We begin by reviewing prior work in this domain, then present our taxonomy and introduce a threat model that captures attackers with different knowledge and capabilities. We then present a structured taxonomy of attack techniques explicitly mapped to corresponding phases of the MLOps ecosystem, supported by examples drawn from red-teaming exercises and real-world incidents. This is followed by a taxonomy of mitigation strategies aligned with these attack categories, offering actionable early-stage defenses to strengthen the security of the MLOps ecosystem. Given the gradual evolution and adoption of MLOps, we further highlight key research gaps that require immediate attention. Our work emphasizes the importance of implementing robust security protocols from the outset, empowering practitioners to safeguard the MLOps ecosystem against evolving cyber attacks.
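
Supply-chain verification is one of the mitigations the summary names. As a rough illustration of the idea (not the paper's own method), the sketch below checks a model artifact against a pinned SHA-256 digest before it is loaded. The function names `sha256_of_file` and `verify_artifact`, the file name `model.bin`, and the assumption that a trusted digest is published alongside the artifact are all hypothetical.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large model artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the artifact's digest matches the pinned value.

    Assumption: `expected_sha256` comes from a trusted source (for example,
    a signed manifest), which this sketch does not model.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())


def demo() -> tuple[bool, bool]:
    """Verify a stand-in artifact, then verify it again after tampering."""
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "model.bin"  # hypothetical artifact name
        artifact.write_bytes(b"stand-in model weights")
        pinned = hashlib.sha256(b"stand-in model weights").hexdigest()

        before = verify_artifact(artifact, pinned)
        artifact.write_bytes(b"stand-in model weights, modified")
        after = verify_artifact(artifact, pinned)
        return before, after


if __name__ == "__main__":
    print(demo())
```

A digest check of this kind only detects modification after the digest was pinned; it says nothing about whether the original artifact was trustworthy, which is where provenance records such as an AI BOM would come in.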

Paper Structure

This paper contains 22 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: The MLOps ecosystem begins by identifying business needs, followed by administrative setup, model development, and deployment. Continuous monitoring then provides essential feedback, enabling iterative improvements.
  • Figure 2: Venn diagram of potential operational roles contributing to secure MLOps ecosystem development.
  • Figure 3: An overview of our proposed taxonomy. The MLOps lifecycle is structured into three distinct and non-overlapping families: Design and Setup, Model Development and Evaluation, and Operations. The bullet points listed under each family specify the corresponding phases within the lifecycle. This organizational structure is adapted from the principles in [ml-ops, mlops_principles_innoq] and is equally applicable to LLMOps.
  • Figure 4: Attacks are organized according to the taxonomic categories presented in Section \ref{sec:survey_taxonomy}. The three groups act as analytical lenses rather than literal or compressed lifecycle phases and support a structured comparison of how diverse threats concentrate around design choices, model-centric workflows, and operational deployment contexts.
  • Figure 5: Mitigation strategies organized according to the taxonomic categories presented in Section \ref{sec:survey_taxonomy}, corresponding to the attacks shown previously. These categories enable systematic alignment of defenses with design choices, model-centric workflows, and operational deployment contexts.