PyMilo: A Python Library for ML I/O

AmirHosein Rostami, Sepand Haghighi, Sadra Sabouri, Alireza Zolanvari

TL;DR

PyMilo presents a transparent, end-to-end solution for exporting and importing ML artifacts using a JSON-based, non-executable format that preserves the original model structure and enables safe deployment. It introduces a Transporter network built on the Chain of Responsibility pattern to serialize diverse data structures, and an ML Streaming framework for server-client web deployment that secures traffic with encryption. The paper illustrates usage with a LinearRegression example and describes a quality-control pipeline that runs across multiple Python versions and platforms and checks post-transport fidelity within strict tolerances. The work aims to improve transparency, safety, and portability in AI tooling, with plans to support more frameworks and protocols.

Abstract

PyMilo is an open-source Python package that addresses the limitations of existing Machine Learning (ML) model storage formats by providing a transparent, reliable, and safe method for exporting and deploying trained models. Current formats, such as pickle and other binary formats, suffer from significant reliability, safety, and transparency issues. In contrast, PyMilo serializes ML models in a transparent, non-executable format, enabling straightforward and safe model exchange while also facilitating the deserialization and deployment of exported models in production environments. The package aims to provide a seamless, end-to-end solution for exporting and importing pre-trained ML models, which simplifies the model development and deployment pipeline.
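
To make the idea of a transparent, non-executable export concrete, the following is a minimal sketch using only the Python standard library. It is not PyMilo's implementation or API: `TinyLinearModel`, `export_model`, and `import_model` are hypothetical names chosen for illustration. The sketch writes a fitted linear model's parameters as plain JSON, reads them back, and checks that predictions agree within a tolerance.

```python
import json
import math
from dataclasses import dataclass, field


@dataclass
class TinyLinearModel:
    """Hypothetical stand-in for a fitted linear regressor."""
    coef: list = field(default_factory=list)
    intercept: float = 0.0

    def predict(self, x):
        # Dot product of coefficients and features, plus the intercept.
        return sum(c * v for c, v in zip(self.coef, x)) + self.intercept


def export_model(model):
    # Only plain data is written; the file contains nothing executable.
    payload = {
        "model_type": "TinyLinearModel",
        "data": {"coef": model.coef, "intercept": model.intercept},
    }
    return json.dumps(payload, indent=2)


def import_model(text):
    payload = json.loads(text)
    # Refuse anything that is not an explicitly supported model type.
    if payload["model_type"] != "TinyLinearModel":
        raise ValueError("unsupported model type")
    return TinyLinearModel(**payload["data"])


original = TinyLinearModel(coef=[0.5, -1.25, 3.0], intercept=0.1)
restored = import_model(export_model(original))

# Post-transport fidelity check on a sample input.
sample = [1.0, 2.0, 3.0]
same = math.isclose(original.predict(sample), restored.predict(sample), rel_tol=1e-9)
```

Unlike unpickling, loading this file only parses data, and a human can read the exported text directly. The tolerance value here is an arbitrary choice for the sketch, not a figure taken from the paper.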

Paper Structure (11 sections, 5 figures, 2 tables)

This paper contains 11 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: PyMilo is an end-to-end, transparent, and safe solution for transporting machine learning models from machine learning frameworks to target devices. Unlike other tools that transform models into alternative representations with structural differences, PyMilo preserves the original model's structure, allowing it to be imported back as the exact same object in its native framework.
  • Figure 2: PyMilo's Transporter network is designed for scikit-learn based ML models. It uses the Chain of Responsibility pattern to construct specialized chains. Each chain has Transporters that manage the serialization and deserialization of specific data structures. Distinct chains are established for different machine learning model categories, such as decision trees and clustering.
  • Figure 3: ML Streaming enables the integration of ML models into web services through a server/client architecture. Users can interact with PyMilo-exported models hosted on a PyMiloServer using a PyMiloClient instance.
  • Figure 4: PyMilo's ML Streaming architecture ensures secure and efficient communication between clients and servers. Data transmitted from the PyMiloClient is compressed and encrypted for secure communication, and then decrypted and decompressed upon arrival at the PyMiloServer for processing. The server's response is subsequently compressed and encrypted before being transmitted back to the client, ensuring end-to-end security.
  • Figure 5: Sample contents of a model exported in a model.json file, illustrating the structure of the model.
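
The Transporter network described in Figure 2 can be illustrated with a small Chain of Responsibility sketch. This is an assumption-laden toy, not PyMilo's actual Transporter classes: the class names, the `__type__` tag, and the chosen data types are all hypothetical. Each link handles one data structure that JSON cannot represent natively and passes everything else to its successor.

```python
import json


class Transporter:
    """One link in the chain: handles a single type or delegates onward."""
    tag = None       # marker stored in the JSON output (hypothetical scheme)
    py_type = None   # Python type this link is responsible for

    def __init__(self, successor=None):
        self.successor = successor

    def serialize(self, value):
        if isinstance(value, self.py_type):
            return {"__type__": self.tag, "value": self.encode(value)}
        # Not ours: pass along, or return unchanged at the end of the chain.
        return self.successor.serialize(value) if self.successor else value

    def deserialize(self, item):
        if isinstance(item, dict) and item.get("__type__") == self.tag:
            return self.decode(item["value"])
        return self.successor.deserialize(item) if self.successor else item


class ComplexTransporter(Transporter):
    tag, py_type = "complex", complex

    def encode(self, value):
        return [value.real, value.imag]

    def decode(self, value):
        return complex(*value)


class SetTransporter(Transporter):
    tag, py_type = "set", set

    def encode(self, value):
        return sorted(value)

    def decode(self, value):
        return set(value)


class BytesTransporter(Transporter):
    tag, py_type = "bytes", bytes

    def encode(self, value):
        return value.hex()

    def decode(self, value):
        return bytes.fromhex(value)


# Build the chain; the order decides which link sees a value first.
chain = ComplexTransporter(SetTransporter(BytesTransporter()))

attributes = {"weight": 1 + 2j, "classes": {0, 1, 2}, "blob": b"\x00\xff", "n_iter": 7}
exported = json.dumps({k: chain.serialize(v) for k, v in attributes.items()})
restored = {k: chain.deserialize(v) for k, v in json.loads(exported).items()}
```

Adding support for another data structure means adding one more link, which is the main appeal of the pattern for a model zoo with many attribute types. A real chain would also need to handle nested containers, which this sketch deliberately leaves out.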