LEMUR Neural Network Dataset: Towards Seamless AutoML

Arash Torabi Goodarzi, Roman Kochnev, Waleed Khalid, Hojjat Torabi Goudarzi, Furui Qin, Tolgay Atinc Uzun, Yashkumar Sanjaybhai Dhameliya, Yash Kanubhai Kathiriya, Zofia Antonina Bentyn, Dmitry Ignatov, Radu Timofte

TL;DR

LEMUR is an open-source dataset and framework that treats neural network architectures themselves as data. It provides standardized PyTorch implementations, YAML specifications, and an SQLite-backed ledger for reproducible benchmarking across common vision tasks and NLP. It integrates Optuna for automated hyperparameter optimization and offers visualization tools and API access, with the aim of accelerating AutoML research, enabling fair cross-model comparisons, and supporting extension through community contributions. The system emphasizes reproducibility, traceability, and accessibility; its modular components are designed to incorporate new architectures, datasets, and metrics under a unified evaluation pipeline. Together, these pieces form an end-to-end ecosystem, from model implementations to analysis and reporting, that can serve as a foundational resource for large-scale neural network experimentation and AutoML studies.

Abstract

Neural networks are the backbone of modern artificial intelligence, but designing, evaluating, and comparing them remains labor-intensive. While numerous datasets exist for training, there are few standardized collections of the models themselves. We introduce LEMUR, an open-source dataset and framework that provides a large collection of PyTorch-based neural networks across tasks such as classification, segmentation, detection, and natural language processing. Each model follows a unified template, with configurations and results stored in a structured database to ensure consistency and reproducibility. LEMUR integrates automated hyperparameter optimization via Optuna, includes statistical analysis and visualization tools, and offers an API for seamless access to performance data. The framework is extensible, allowing researchers to add new models, datasets, or metrics without breaking compatibility. By standardizing implementations and unifying evaluation, LEMUR aims to accelerate AutoML research, enable fair benchmarking, and reduce barriers to large-scale neural network experimentation. To support adoption and collaboration, LEMUR and its plugins are released under the MIT license at https://github.com/ABrain-One/nn-dataset, https://github.com/ABrain-One/nn-plots, and https://github.com/ABrain-One/nn-vr.
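
As a rough illustration of what storing configurations and results in a structured database can look like, the sketch below keeps a small run ledger in SQLite using only Python's standard library. It is not LEMUR's actual schema or API: the table name `runs`, its columns, the helper functions `open_ledger`, `record_run`, and `best_per_model`, and the accuracy values in the usage note are all hypothetical and chosen for illustration.

```python
import json
import sqlite3

# Hypothetical schema: the table and column names are illustrative
# assumptions, not LEMUR's actual database layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    task        TEXT    NOT NULL,
    dataset     TEXT    NOT NULL,
    model       TEXT    NOT NULL,
    epoch       INTEGER NOT NULL,
    hyperparams TEXT    NOT NULL,
    accuracy    REAL    NOT NULL,
    PRIMARY KEY (task, dataset, model, epoch, hyperparams)
)
"""


def open_ledger(path=":memory:"):
    """Open (or create) the ledger; the default is an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_run(conn, task, dataset, model, epoch, hyperparams, accuracy):
    """Store one result; re-recording the same configuration overwrites it."""
    # sort_keys gives a canonical JSON string, so equal configs map to one key.
    config = json.dumps(hyperparams, sort_keys=True)
    conn.execute(
        "INSERT OR REPLACE INTO runs VALUES (?, ?, ?, ?, ?, ?)",
        (task, dataset, model, epoch, config, accuracy),
    )
    conn.commit()


def best_per_model(conn, task, dataset):
    """Return (model, best accuracy) pairs for one task/dataset, best first."""
    cursor = conn.execute(
        "SELECT model, MAX(accuracy) FROM runs "
        "WHERE task = ? AND dataset = ? "
        "GROUP BY model ORDER BY MAX(accuracy) DESC, model",
        (task, dataset),
    )
    return cursor.fetchall()
```

With made-up numbers, calling `record_run(conn, "img-classification", "cifar-10", "AlexNet", 1, {"lr": 0.01}, 0.61)` a few times for different models and epochs and then `best_per_model(conn, "img-classification", "cifar-10")` yields one row per model. Keying each row on the full configuration is one simple way to make repeated runs traceable and to avoid duplicate entries.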

Paper Structure

This paper contains 23 sections, 12 figures, 3 tables.

Figures (12)

  • Figure 1: A high-level illustration of the LEMUR pipeline, including components such as preprocessing, dynamic task allocation, JSON handling, Optuna hyperparameter optimization with additional user-specified neural networks, data loaders, transformations, and metric evaluation.
  • Figure 2: Tree-like diagram illustrating examples of supported tasks, neural network architectures, transformations, metrics, and datasets. The framework's modularity allows users to add custom loaders, networks, and metrics for experimentation.
  • Figure 3: Scatter plot showing the relationship between accuracy and training time (in nanoseconds) for different tasks (image classification, image segmentation, and object detection). Image classification demonstrates rapid accuracy improvement with lower training times and achieves high accuracy consistently. Image segmentation exhibits slower improvements with moderate accuracy, while object detection has lower initial accuracy and requires longer training times to stabilize. The plot emphasizes the varying computational demands and learning behaviors across tasks.
  • Figure 4: Scatter plot illustrating the variation of accuracy across epochs for different tasks (image classification, image segmentation, and object detection). The plot highlights distinct task-specific accuracy trends, with image classification showing faster and more consistent improvements, while image segmentation and object detection exhibit lower initial accuracy and more gradual improvements. Consistency in accuracy increases with epochs for all tasks, reflecting convergence and stabilization over time.
  • Figure 5: Accuracy trends across various datasets. Subfigures (a)–(f) highlight the accuracy progression for different models over epochs. Simpler datasets like MNIST and CIFAR-10 show rapid convergence and higher accuracy, while complex datasets like CIFAR-100 and Places365 exhibit slower progress and more variance. Models like AirNet and AlexNet consistently demonstrate robust performance across tasks.
  • ...and 7 more figures