WildGEN: Long-horizon Trajectory Generation for Wildlife
Ali Al-Lawati, Elsayed Eshra, Prasenjit Mitra
TL;DR
WildGEN addresses the challenge of generating realistic long-horizon wildlife trajectories from sparse real samples by employing a Variational Autoencoder with an encoder $q(z|T)$ and decoder $p(T|z)$, augmented by a Gaussian Mixture Model in latent space. Generated trajectories are refined through Savitzky-Golay smoothing and constrained by a Minimum Bounding Region. They are benchmarked against Lévy walk/flight and heteroscedastic Gaussian process regression (GPR) baselines on Movebank geese data, achieving lower Hausdorff distances and higher cluster similarity (Pearson $r$). The approach demonstrates a data-efficient way to augment wildlife trajectory datasets, with a reported 15.5% improvement over GPR in path similarity, and leaves room for future extensions such as Fréchet distance evaluation and normalizing flows. Overall, WildGEN offers a practical, post-processed generative solution for wildlife movement studies that can enhance training data and simulation capabilities while respecting data-collection constraints.
Abstract
Trajectory generation is an important concern in pedestrian, vehicle, and wildlife movement studies. Generated trajectories help enrich the training corpus for deep learning applications and can facilitate simulation tasks. This is especially significant in the wildlife domain, where obtaining additional real data can be prohibitively expensive and time-consuming, and can raise ethical concerns. In this paper, we introduce WildGEN: a conceptual framework that addresses this challenge with a method based on Variational Autoencoders (VAEs), which learns the movement characteristics exhibited by wild geese over a long horizon from a sparse set of ground-truth samples. The generated trajectories are then post-processed with smoothing filters to reduce excessive wandering. We evaluate the results through visual inspection and by computing the Hausdorff distance between the generated and real trajectories. In addition, we use the Pearson correlation coefficient to measure how realistic the trajectories are, based on the similarity of clusters computed on the generated and real trajectories.
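To make the post-processing and evaluation steps concrete, the following is a minimal sketch, not the implementation used in this work. It assumes trajectories are stored as (N, 2) arrays of planar coordinates. The helper names `smooth_trajectory` and `hausdorff`, the filter settings (`window_length=11`, `polyorder=3`), and the synthetic sine-shaped path are illustrative assumptions only. The smoothing filter is taken to be Savitzky-Golay, and the distance is the symmetric Hausdorff distance computed with SciPy.

```python
import numpy as np
from scipy.signal import savgol_filter
from scipy.spatial.distance import directed_hausdorff


def smooth_trajectory(traj, window_length=11, polyorder=3):
    """Smooth an (N, 2) trajectory along the time axis.

    window_length and polyorder are assumed values, not taken from the paper.
    """
    traj = np.asarray(traj, dtype=float)
    return savgol_filter(traj, window_length=window_length,
                         polyorder=polyorder, axis=0)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two (N, 2) point sequences."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])


# Synthetic stand-ins for a real and a generated trajectory (hypothetical data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
real = np.column_stack([t, np.sin(2 * np.pi * t)])
generated = real + rng.normal(scale=0.05, size=real.shape)

# Post-process the generated path, then compare both versions to the real one.
smoothed = smooth_trajectory(generated)
d_raw = hausdorff(generated, real)
d_smooth = hausdorff(smoothed, real)
print(f"Hausdorff (raw): {d_raw:.4f}  Hausdorff (smoothed): {d_smooth:.4f}")
```

On real GPS data the coordinates would typically be longitude and latitude, so a Euclidean Hausdorff distance is only an approximation there; projecting to a planar coordinate system first is one possible choice.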
