Creative Beam Search: LLM-as-a-Judge For Improving Response Generation
Giorgio Franceschelli, Mirco Musolesi
TL;DR
Creative Beam Search (CBS) addresses the gap between intentional human creativity and current LLM generation by combining a generation phase based on Diverse Beam Search (DBS) with a validation phase based on LLM-as-a-Judge. Grounded in Amabile's creativity framework, CBS uses DBS to produce diverse candidates and a self-evaluation step to select the preferred one. A qualitative study with 31 graduate students shows that CBS output is often perceived as more creative than output from standard sampling, and that the self-evaluation step improves selection; for some prompts, however, the outputs are too similar for a preference to be expressed. Limitations include the lack of true intentionality, potential biases in self-evaluation, and computational cost. Future work aims to broaden the candidate pool and optimize prompt structures for better co-creative performance.
Abstract
Large language models are revolutionizing several areas, including artificial creativity. However, the process of generation in machines profoundly diverges from that observed in humans. In particular, machine generation lacks both intentionality and an underlying creative process. We propose a method called Creative Beam Search that uses Diverse Beam Search and LLM-as-a-Judge to perform response generation and response validation. The results of a qualitative experiment show that our approach can provide better output than standard sampling techniques. We also show that the response validation step is a necessary complement to the response generation step.
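The two-step structure described above (generate diverse candidates, then let a judge pick one) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `select_with_judge`, `creative_beam_search_sketch`, `toy_generate`, and `toy_judge` are hypothetical; the generator is a stub standing in for Diverse Beam Search, and the judge is a stub standing in for an LLM call. Showing the candidates in rotated orders and tallying votes is an assumption made here as one plausible way to soften position bias in self-evaluation.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical interfaces (assumptions, not taken from the paper):
#   a generator maps (prompt, k) to up to k candidate responses;
#   a judge maps (prompt, shown candidates) to the index of its preferred one.
Generator = Callable[[str, int], List[str]]
Judge = Callable[[str, Sequence[str]], int]


def select_with_judge(prompt: str, candidates: Sequence[str], judge: Judge) -> str:
    """Validation step: ask the judge once per rotation and tally the votes."""
    n = len(candidates)
    if n == 0:
        raise ValueError("at least one candidate is required")
    votes: Counter = Counter()
    for shift in range(n):
        # Rotate the presentation order so each candidate is shown first once
        # (an assumed mitigation for position bias).
        order = [(i + shift) % n for i in range(n)]
        shown = [candidates[i] for i in order]
        picked = judge(prompt, shown)
        votes[order[picked]] += 1  # map back to the original index
    # Most votes wins; ties go to the earliest candidate.
    best = max(range(n), key=lambda i: (votes[i], -i))
    return candidates[best]


def creative_beam_search_sketch(
    prompt: str, generate: Generator, judge: Judge, num_candidates: int = 4
) -> str:
    """Generation step followed by validation step."""
    candidates = generate(prompt, num_candidates)
    return select_with_judge(prompt, candidates, judge)


def toy_generate(prompt: str, k: int) -> List[str]:
    """Stub standing in for Diverse Beam Search: returns fixed candidates."""
    pool = [
        "the sea is blue",
        "the sea hums old copper songs",
        "the sea is big",
        "the sea is wet",
    ]
    return pool[:k]


def toy_judge(prompt: str, shown: Sequence[str]) -> int:
    """Stub standing in for an LLM judge: prefers the most distinct words."""
    return max(range(len(shown)), key=lambda i: len(set(shown[i].split())))


print(creative_beam_search_sketch("describe the sea", toy_generate, toy_judge))
```

In practice, `toy_generate` would be replaced by a real Diverse Beam Search decoder and `toy_judge` by a prompt to the same LLM asking which candidate it prefers. The selection logic stays unchanged because it only depends on the two callables.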
