Optimal L-Systems for Stochastic L-system Inference Problems
Ali Lotfi, Ian McQuillan
TL;DR
This work tackles stochastic L-system inference by formalizing two optimization problems: (i) constructing a stochastic 0L-system (S0L) that maximizes the probability of a single derivation generating a given sequence θ, and (ii) constructing an S0L that maximizes the probability of generating θ taken over all derivations. It resolves them with two theorems. The first gives a sharp upper bound on the derivation probability p(d) together with an explicit optimal rule for each predecessor symbol. The second gives a posynomial-optimization formulation whose solution is a highest-probability S0L over all derivations. The proposed algorithm uses nonlinear optimization with interior-point methods to compute the optimal production probabilities p*(a→y) from θ, which enables learning stochastic grammars from positive data only. Potential applications include automated plant morphology modeling, synthetic data generation for vision, and other domains that require probabilistic, parallel string rewriting with positive-only training data.
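To make the single-derivation problem concrete, the sketch below assumes that the optimal per-predecessor rule coincides with the standard maximum-likelihood (relative-frequency) assignment, p(a→y) = (number of uses of a→y in d) / (number of rewrites of a in d). That assignment is the usual maximizer of a product of powers under per-symbol normalization; whether it matches the paper's theorem word for word is an assumption here. The toy derivation and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def frequency_rule(derivation):
    """Relative-frequency probabilities for one fixed derivation.

    derivation: list of steps; each step is a list of
    (symbol, successor) pairs, one pair per rewritten symbol.
    """
    rule_counts = Counter()
    symbol_counts = Counter()
    for step in derivation:
        for symbol, successor in step:
            rule_counts[(symbol, successor)] += 1
            symbol_counts[symbol] += 1
    return {
        (a, y): Fraction(n, symbol_counts[a])
        for (a, y), n in rule_counts.items()
    }


def derivation_probability(derivation, probs):
    """p(d): product of the probabilities of every rule application in d."""
    p = Fraction(1)
    for step in derivation:
        for symbol, successor in step:
            p *= probs[(symbol, successor)]
    return p


# Hypothetical derivation of theta = ("aa", "aba", "abab").
derivation = [
    [("a", "ab"), ("a", "a")],              # aa  => aba
    [("a", "a"), ("b", "b"), ("a", "ab")],  # aba => abab
]

probs = frequency_rule(derivation)
best = derivation_probability(derivation, probs)

# Any other normalized assignment should not do better on this derivation.
other = {("a", "ab"): Fraction(1, 3), ("a", "a"): Fraction(2, 3), ("b", "b"): Fraction(1)}
worse = derivation_probability(derivation, other)
```

Exact fractions are used instead of floats so that the comparison between assignments is not affected by rounding.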
Abstract
This paper presents two novel theorems that address two open problems in stochastic Lindenmayer-system (L-system) inference, both concerning the construction of an optimal stochastic L-system that generates a given sequence of strings. The first theorem gives a method for constructing a stochastic L-system that maximizes the probability of a single derivation producing the given sequence of words (several distinct derivations may generate the same sequence). The second theorem characterizes the stochastic L-systems with the highest probability of producing the given sequence when all of its possible derivations are taken into account. From these, we introduce an algorithm to infer an optimal stochastic L-system from a given sequence. The algorithm uses optimization techniques such as interior-point methods to produce a stochastic L-system that maximizes the probability of generating the given sequence (allowing for multiple derivations). This makes stochastic L-systems usable as a machine learning model trained on positive data only.
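As an illustration of the "multiple derivations" objective, the following sketch evaluates the total probability that a context-free stochastic L-system generates a sequence, summing over every way each word can be split into successors of the previous word's symbols. It assumes rule choices are independent across symbols and steps, so the total factorizes into one sum per consecutive pair of words. It only evaluates the objective for a given probability assignment; the optimization step (for example, an interior-point solver) is not shown. The function names and the toy sequence are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache


def step_probability(u, v, probs):
    """Total probability of rewriting word u into word v in one parallel step.

    probs maps (symbol, successor) to a probability. The sum ranges over
    all factorizations v = y1 y2 ... yn with a rule u[i] -> yi for each i.
    """
    @lru_cache(maxsize=None)
    def total(i, j):
        # Probability mass of rewriting the suffix u[i:] into v[j:].
        if i == len(u):
            return Fraction(1) if j == len(v) else Fraction(0)
        mass = Fraction(0)
        for k in range(j, len(v) + 1):
            p = probs.get((u[i], v[j:k]))
            if p:
                mass += p * total(i + 1, k)
        return mass

    return total(0, 0)


def sequence_probability(theta, probs):
    """Product of step probabilities over consecutive words of theta."""
    result = Fraction(1)
    for u, v in zip(theta, theta[1:]):
        result *= step_probability(u, v, probs)
    return result


# Hypothetical example: "aa" => "aba" has two derivations,
# (a -> ab, a -> a) and (a -> a, a -> ba).
theta = ("aa", "aba")
probs = {
    ("a", "a"): Fraction(1, 2),
    ("a", "ab"): Fraction(1, 4),
    ("a", "ba"): Fraction(1, 4),
}
total_probability = sequence_probability(theta, probs)

uniform = {rule: Fraction(1, 3) for rule in probs}
uniform_probability = sequence_probability(theta, uniform)
```

Writing x, y, z for the probabilities of a→a, a→ab, a→ba, the objective in this toy case is the posynomial xy + xz subject to x + y + z = 1, which is the kind of expression the second problem asks to maximize.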
