Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same
Sungjun Ahn, Hyun-Jeong Yim, Youngwan Lee, Sung-Ik Park
TL;DR
The paper proposes a GenAI-driven media casting paradigm in which content is generated at receiver devices from semantic prompts delivered by prompt supply networks (PSN), reducing pixel-level data transmission and enabling super-personalized, non-repeating short-form content. It introduces a formal architectural framework consisting of prompt-driven media (PDM), a generative receive terminal (GRT), and a content generator (CG), along with single-source operations (O_S) and multi-source operations (O_M) that support dynamic, multi-provider content assembly. The work surveys advances in text-to-image, image-to-video, and text-to-video generation, motivates structured prompt design, and argues for a new semantic media protocol that replaces legacy MPEG-7-like descriptions with richer, graph-based representations. Use cases emphasize fatigue-free digital advertising and efficient delivery over constrained channels, while future directions discuss dynamic coordination across broadcast-broadband convergence and broader infrastructure applications, positioning this media casting paradigm, abbreviated GMC, as a new supply-side model for GenAI-driven media.
Abstract
This paper introduces a media service model that exploits artificial intelligence (AI) video generators at the receiving end. The proposal departs from the traditional multimedia ecosystem, which relies entirely on in-house production, by shifting part of the content creation onto the receiver. We bring a semantic process into the framework: instead of distributing encoded data of fully finished programs, the distribution network provides service elements that prompt the content generator. These service elements, collectively referred to as semantic sources, include finely tailored text descriptions, lightweight image data of selected objects, and application programming interfaces; the user terminal translates the received semantic data into video frames. Because generative AI is inherently random, users can then experience super-personalized services. The proposed idea also covers situations in which the user receives element packages from different service providers, either as a sequence of packages over time or as multiple packages at the same time. Provided that in-context coherence and content integrity are guaranteed, these combinatorial dynamics amplify service diversity, so that users continually encounter new experiences. This work particularly targets short-form videos and advertisements, where users easily tire of seeing the same frame sequence every time. In those use cases, the content provider's role is recast from producing fully finished content to scripting semantic sources. Overall, this work explores a new form of media ecosystem facilitated by receiver-embedded generative models, combining random content dynamics with enhanced delivery efficiency.
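As a rough illustration of the receiver-side flow described in the abstract, the following minimal Python sketch shows semantic packages from two providers being merged into one prompt and rendered with a per-playback seed. Every name here (`SemanticPackage`, `compose_prompt`, `generate_stub`, `play`) is hypothetical and not taken from the paper, and the generator is a hash-based stand-in rather than a real video model; it only makes visible that the same semantic sources yield different output on each playback.

```python
import hashlib
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticPackage:
    """Hypothetical container for one provider's semantic sources."""
    provider: str
    text: str               # text description that prompts the generator
    image_refs: tuple = ()  # lightweight object images (file names only here)


def compose_prompt(packages):
    """Merge packages received in sequence or simultaneously into one prompt."""
    return " | ".join(f"[{p.provider}] {p.text}" for p in packages)


def generate_stub(prompt, seed, n_frames=4):
    """Stand-in for a receiver-embedded video generator (an assumption).

    Returns one short hash per "frame" so the effect of the seed is visible;
    a real generator would return pixel data instead.
    """
    return [
        hashlib.sha256(f"{prompt}:{seed}:{i}".encode()).hexdigest()[:8]
        for i in range(n_frames)
    ]


def play(packages, seed=None):
    """One playback: draw a fresh seed per call unless one is supplied."""
    if seed is None:
        seed = random.SystemRandom().getrandbits(32)
    return generate_stub(compose_prompt(packages), seed)


# Two providers' packages arriving at the same terminal (illustrative data).
ad = SemanticPackage("brand_a", "a runner laces up shoes at sunrise", ("shoe.png",))
scene = SemanticPackage("city_b", "rainy downtown street, neon signs")

first = play([ad, scene], seed=1)
second = play([ad, scene], seed=2)
```

Fixing the seed reproduces a playback exactly, while leaving it unset gives a new frame sequence each time. In this sketch only the short text prompt and image references would need to be transmitted, not the frames themselves.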
