Diffusion Models on the Edge: Challenges, Optimizations, and Applications

Dongqi Zheng

TL;DR

Diffusion models offer high-fidelity generative capabilities, but their heavy computational cost hinders deployment on edge devices. The paper surveys foundational diffusion concepts, edge-specific constraints, and a broad set of optimization and hardware-software co-design strategies, including sampling acceleration, latent-space diffusion, quantization, pruning, distillation, and operator fusion. It catalogs platform-specific optimization case studies, runtime frameworks, profiling tools, and standardized benchmarks, and highlights practical pipelines for on-device deployment. By mapping techniques to concrete edge scenarios and metrics, the work outlines a path toward private, low-energy, real-time diffusion on smartphones, wearables, and IoT devices with broad societal and economic impact.

Abstract

Diffusion models have shown remarkable capabilities in generating high-fidelity data across modalities such as images, audio, and video. However, their computational intensity makes deployment on edge devices a significant challenge. This survey explores the foundational concepts of diffusion models, identifies key constraints of edge platforms, and synthesizes recent advancements in model compression, sampling efficiency, and hardware-software co-design to make diffusion models viable on edge devices. We also review promising applications and suggest future research directions.

Paper Structure

This paper contains 32 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Illustration of the forward and reverse diffusion process used in generative models.