A Survey on Generative Modeling with Limited Data, Few Shots, and Zero Shot
Milad Abdollahzadeh, Guimeng Liu, Touba Malekzadeh, Christopher T. H. Teo, Keshigeyan Chandrasegaran, Ngai-Man Cheung
TL;DR
This survey tackles Generative Modeling under Data Constraint (GM-DC), addressing how GANs, VAEs, and diffusion models perform in limited-data, few-shot, and zero-shot settings. It introduces two novel taxonomies, one for GM-DC tasks and another for GM-DC approaches, and provides a comprehensive review of 230+ papers, complemented by a Sankey diagram that maps task-approach-method interactions. The work highlights core challenges (overfitting, frequency bias, distant-domain transfer, evaluation) and synthesizes practical recommendations across transfer learning, data augmentation, architectural design, multi-task objectives, frequency-aware methods, meta-learning, and internal patch distribution modeling. It also outlines future directions, including leveraging foundation models, robust zero-shot grounding, distant-domain transfer, holistic evaluation, and data-centric strategies, aiming to guide researchers and practitioners in advancing GM-DC.
Abstract
Generative modeling in machine learning aims to synthesize new data samples that are statistically similar to those observed during training. While conventional generative models such as GANs and diffusion models typically assume access to large and diverse datasets, many real-world applications (e.g. in medicine, satellite imaging, and artistic domains) operate under limited data availability and strict constraints. In this survey, we examine Generative Modeling under Data Constraint (GM-DC), which includes limited-data, few-shot, and zero-shot settings. We present a unified perspective on the key challenges in GM-DC, including overfitting, frequency bias, and incompatible knowledge transfer, and discuss how these issues impact model performance. To systematically analyze this growing field, we introduce two novel taxonomies: one categorizing GM-DC tasks (e.g. unconditional vs. conditional generation, cross-domain adaptation, and subject-driven modeling), and another organizing methodological approaches (e.g. transfer learning, data augmentation, meta-learning, and frequency-aware modeling). Our study reviews over 230 papers, offering a comprehensive view across generative model types and constraint scenarios. We further analyze task-approach-method interactions using a Sankey diagram and highlight promising directions for future work, including adaptation of foundation models, holistic evaluation frameworks, and data-centric strategies for sample selection. This survey provides a timely and practical roadmap for researchers and practitioners aiming to advance generative modeling under limited data. Project website: https://sutd-visual-computing-group.github.io/gmdc-survey/.
