Automatic Stage Lighting Control: Is it a Rule-Driven Process or Generative Task?
Zijian Zhao, Dian Jin, Zijing Zhou, Xiaoyu Zhang
TL;DR
This work reframes Automatic Stage Lighting Control (ASLC) as a cross-modal generative task rather than a fixed rule-based mapping. It introduces Skip-BART, an end-to-end transformer model that predicts stage lighting hue and intensity from music, using a skip connection to align music and light at the frame level and transfer learning from PianoBART. The authors release RPMC-L^2, the first stage lighting dataset, to enable learning from professional lighting data, and combine pre-training and fine-tuning with a Restricted Stochastic Temperature-Controlled sampling strategy for stable yet diverse outputs. Quantitative and human evaluations show that Skip-BART outperforms rule-based methods across all metrics and comes close to human lighting engineers, pointing to the practical potential of data-driven, generative ASLC. The work also outlines limitations and avenues for future research, including real-time operation, multi-light control, and richer multimodal signals.
Abstract
Stage lighting is a vital component of live music performances, shaping an engaging experience for both musicians and audiences. In recent years, Automatic Stage Lighting Control (ASLC) has attracted growing interest due to the high cost of hiring or training professional lighting engineers. However, most existing ASLC solutions only classify music into a limited set of categories and map them to predefined light patterns, resulting in formulaic, monotonous outcomes that lack a clear rationale. To address this gap, this paper presents Skip-BART, an end-to-end model that learns directly from experienced lighting engineers and predicts vivid, human-like stage lighting. To the best of our knowledge, this is the first work to conceptualize ASLC as a generative task rather than merely a classification problem. Our method adapts the BART model to take music audio as input and produce light hue and value (intensity) as output, incorporating a novel skip connection mechanism to strengthen the relationship between music and light within the frame grid. To address the lack of available datasets, we create the first stage lighting dataset and apply several pre-training and transfer learning techniques to improve model training with limited data. We validate our method through both quantitative analysis and a human evaluation, demonstrating that Skip-BART outperforms conventional rule-based methods across all evaluation metrics and shows only a limited gap compared to real lighting engineers. To support further research, we have made our self-collected dataset, code, and trained model parameters available at https://github.com/RS2002/Skip-BART.
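To make the output representation concrete, the following is a minimal sketch of how one predicted (hue, value) pair could be turned into an RGB color for a light fixture. It is an illustration only, not the authors' implementation: the function name `bins_to_rgb`, the bin counts (180 hue bins, 256 value bins), and the fixed saturation of 1.0 are all assumptions made for this example, since the abstract specifies only that the model outputs hue and value.

```python
import colorsys


def bins_to_rgb(hue_bin, value_bin, n_hue_bins=180, n_value_bins=256, saturation=1.0):
    """Convert discrete hue/value indices to an 8-bit RGB triple.

    All defaults are hypothetical: the bin counts and the fixed
    saturation are assumptions for illustration, not values from the paper.
    """
    if not 0 <= hue_bin < n_hue_bins:
        raise ValueError("hue_bin out of range")
    if not 0 <= value_bin < n_value_bins:
        raise ValueError("value_bin out of range")

    # Hue is circular, so divide by the bin count to map into [0, 1).
    hue = hue_bin / n_hue_bins
    # Value is linear, so the top bin maps to full intensity (1.0).
    value = value_bin / (n_value_bins - 1)

    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    return tuple(round(c * 255) for c in (r, g, b))


if __name__ == "__main__":
    # One frame per entry: (hue_bin, value_bin) pairs chosen by hand.
    frames = [(0, 255), (90, 255), (90, 0)]
    for hue_bin, value_bin in frames:
        print(hue_bin, value_bin, bins_to_rgb(hue_bin, value_bin))
```

Under these assumptions, a sequence of per-frame predictions maps to a sequence of RGB colors, with value controlling brightness independently of hue.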
