Generating Print-Ready Personalized AI Art Products from Minimal User Inputs

Noah Pursell, Anindya Maiti

TL;DR

The paper addresses two bottlenecks in producing print-ready AI art at large formats: the complexity of prompt engineering and the low native resolution of diffusion models. It introduces a two-part pipeline that combines enhanced prompt generation (three methods: LLM-based, LLM with RAG-based multishot, and RAG-based templating) with AI upscaling (nine upscalers evaluated) to convert minimal user input into high-resolution prints, demonstrated with Stable Diffusion XL. The study systematically compares the prompt-generation strategies and upscaling techniques, offers practical guidance on cost, flexibility, diversity, and image quality, and shows how to reach $4096\times4096$ outputs from $1024\times1024$ generations. The work aims to make AI art more accessible and commercially viable by giving consumers, designers, and businesses a streamlined, end-to-end workflow for large-format, print-ready images.
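
To make the resolution gap concrete, the arithmetic behind the $1024\times1024$ to $4096\times4096$ step can be sketched as follows. This is only an illustration: the 300 DPI print density is an assumed value commonly used for high-quality prints, not a figure taken from the paper, and the helper names are hypothetical.

```python
# Illustrative arithmetic only; ASSUMED_DPI is an assumption, not from the paper.
ASSUMED_DPI = 300


def upscale_factor(src_px: int, dst_px: int) -> float:
    """Per-side scaling factor needed to go from src_px to dst_px."""
    return dst_px / src_px


def print_size_inches(pixels: int, dpi: int = ASSUMED_DPI) -> float:
    """Physical side length (in inches) of a print at the given density."""
    return pixels / dpi


native_px = 1024    # side length of a native generation, per the summary
upscaled_px = 4096  # side length of the upscaled output, per the summary

factor = upscale_factor(native_px, upscaled_px)
pixel_ratio = (upscaled_px ** 2) // (native_px ** 2)

print(f"per-side upscale factor: {factor:g}x")
print(f"total pixel count ratio: {pixel_ratio}x")
print(f"native print side at {ASSUMED_DPI} DPI: {print_size_inches(native_px):.2f} in")
print(f"upscaled print side at {ASSUMED_DPI} DPI: {print_size_inches(upscaled_px):.2f} in")
```

Under that assumed density, a native generation covers only a few inches per side, while the upscaled output covers roughly four times the side length, which is why the upscaling stage matters for large-format products.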

Abstract

We present a novel framework to advance generative artificial intelligence (AI) applications in the realm of printed art products, specifically addressing large-format products that require high-resolution artworks. The framework consists of a pipeline that addresses two major challenges in the domain: the high complexity of generating effective prompts, and the low native resolution of images produced by diffusion models. By integrating AI-enhanced prompt generations with AI-powered upscaling techniques, our framework can efficiently produce high-quality, diverse artistic images suitable for many new commercial use cases. Our work represents a significant step towards democratizing high-quality AI art, opening new avenues for consumers, artists, designers, and businesses.

Paper Structure (17 sections, 16 figures, 2 tables)

Figures (16)

  • Figure 1: Overview of our framework for creating printed AI art products, with minimal user inputs.
  • Figure 2: Overview of Language Model (LLM) Based Generation.
  • Figure 3: Overview of LLM with RAG-Based Multishot.
  • Figure 4: Caption of the figure.
  • Figure 5: Prompt enhancement times using the three proposed methods.
  • ...and 11 more figures