A Dataset for Mechanical Mechanisms

Farshid Ghezelbash, Amir Hossein Eskandari, Amir J Bidhendi

TL;DR

To accelerate mechanical mechanism design, the paper introduces a curated dataset of 8,994 image–description pairs, including both 2D and 3D sketches. It demonstrates the utility of the dataset by fine-tuning Stable Diffusion 1.6 to generate mechanism designs and BLIP-2 to caption them, highlighting both promise and current limitations. The results show that 3D sketches align more closely with textual prompts, while 2D sketches often lack coherent structure, and captions produced by BLIP-2 are imperfect due to limited training. The work demonstrates the potential of task-specific datasets to enable AI-assisted mechanical design and outlines concrete directions for expanding dataset size, improving models, and validating designs with domain experts.

Abstract

This study introduces a dataset consisting of approximately 9,000 images of mechanical mechanisms and their corresponding descriptions, aimed at supporting research in mechanism design. The dataset consists of a diverse collection of 2D and 3D sketches, meticulously curated to ensure relevance and quality. We demonstrate the application of this dataset by fine-tuning two models: 1) Stable Diffusion (for generating new mechanical designs), and 2) BLIP-2 (for captioning these designs). While the results from Stable Diffusion show promise, particularly in generating coherent 3D sketches, the model struggles with 2D sketches and occasionally produces nonsensical outputs. These limitations underscore the need for further development, particularly in expanding the dataset and refining model architectures. Nonetheless, this work serves as a step towards leveraging generative AI in mechanical design, highlighting both the potential and current limitations of these approaches.

Paper Structure

This paper contains 12 sections, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Word cloud of the text descriptions of the mechanisms.
  • Figure 2: Nine randomly selected mechanisms with their descriptions (limited to 150 characters).
  • Figure 3: Examples of 2D (right) and 3D (middle) sketches generated by the fine-tuned Stable Diffusion model from a text input (left). The 3D sketches generally align better with the provided descriptions, capturing key components of mechanical mechanisms, while the 2D sketches often lack coherence and meaningful structure.
  • Figure 4: Examples of nonsensical/hallucinated outputs generated by the fine-tuned Stable Diffusion model from a text input (2D: right; 3D: middle; input prompt: left). These examples illustrate that the model occasionally struggles to accurately interpret prompts, resulting in outputs that do not align with the intended mechanical designs and lack meaningful structure.
  • Figure 5: Six randomly selected mechanisms with their model-generated and real captions.
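
The dataset pairs each mechanism image with a text description, and Figure 2 shows descriptions limited to 150 characters. A minimal sketch of how such pairs might be represented and shortened for display follows. The `MechanismRecord` class, its field names, the sample file path, and the `truncate_description` helper are all hypothetical and chosen only for illustration; the paper does not specify its storage format or truncation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MechanismRecord:
    """One image-description pair (hypothetical layout, not the paper's schema)."""
    image_path: str
    description: str
    sketch_type: str  # assumed label: "2D" or "3D"


def truncate_description(text: str, limit: int = 150) -> str:
    """Collapse whitespace and cut the text to at most `limit` characters."""
    cleaned = " ".join(text.split())
    if len(cleaned) <= limit:
        return cleaned
    # Reserve three characters for the ellipsis so the result never exceeds `limit`.
    return cleaned[: limit - 3].rstrip() + "..."


# Illustrative record; the path and wording are invented for this example.
record = MechanismRecord(
    image_path="images/example_0001.png",
    description="A crank rotates about a fixed pivot and drives a slider "
                "through a connecting rod, converting rotary motion into "
                "reciprocating linear motion along a guide rail. " * 2,
    sketch_type="3D",
)

short = truncate_description(record.description)
print(len(short), short)
```

Truncating after whitespace normalization keeps the displayed length predictable regardless of line breaks in the raw description.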