Evaluating machine learning models for predicting pesticide toxicity to honey bees
Jakub Adamczyk, Jakub Poziemski, Pawel Siedlecki
TL;DR
This work benchmarks machine learning approaches for predicting pesticide toxicity to honey bees on the ApisTox dataset and shows that agrochemicals occupy a chemical space distinct from that of medicinal benchmarks. Molecular fingerprints and WL-OA graph kernels outperform graph neural networks and pretrained models, suggesting that current pretrained embeddings are biased toward drug-like chemistry and do not generalize well to agrochemicals. The study also demonstrates the value of interpretable techniques, such as counterfactual explanations, for validating model behavior and supporting regulatory considerations. Overall, the findings highlight the need for domain-specific datasets and models in agrochemistry to improve predictive ecotoxicology and support safer pesticide design.
Abstract
Small molecules play a critical role in the biomedical, environmental, and agrochemical domains, each with distinct physicochemical requirements and success criteria. Although biomedical research benefits from extensive datasets and established benchmarks, agrochemical data remain scarce, particularly with respect to species-specific toxicity. This work focuses on ApisTox, the most comprehensive dataset of experimentally validated chemical toxicity to the honey bee (\textit{Apis mellifera}), an ecologically vital pollinator. The primary goal of this study was to determine the suitability of diverse machine learning approaches for modeling such toxicity, including molecular fingerprints, graph kernels, and graph neural networks, as well as pretrained models. Comparative analysis with medicinal datasets from the MoleculeNet benchmark reveals that ApisTox represents a distinct chemical space. Performance degradation on non-medicinal datasets, such as ApisTox, demonstrates the limited generalizability of current state-of-the-art algorithms trained solely on biomedical data. Our study highlights the need for more diverse datasets and for targeted model development geared toward the agrochemical domain.
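To make the notion of a "distinct chemical space" concrete, the sketch below shows one common way such a comparison could be quantified: each molecule is represented as the set of on-bits of a binary fingerprint, and for every molecule in one dataset we take its highest Tanimoto similarity to any molecule in the other. This is an illustrative assumption, not necessarily the analysis used in this study; the function names and the toy fingerprints are hypothetical, and in practice the bit sets would come from a cheminformatics toolkit.

```python
from typing import FrozenSet, Iterable, List

# A binary fingerprint, stored as the indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbor_similarities(
    queries: Iterable[Fingerprint],
    reference: Iterable[Fingerprint],
) -> List[float]:
    """For each query, the highest similarity to any reference fingerprint.

    Assumes `reference` is non-empty.
    """
    reference = list(reference)
    return [max(tanimoto(q, r) for r in reference) for q in queries]


# Toy, made-up fingerprints (hypothetical; not real molecules).
agro = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 9})]
medicinal = [frozenset({1, 2, 3, 8}), frozenset({10, 11})]

sims = nearest_neighbor_similarities(agro, medicinal)
mean_sim = sum(sims) / len(sims)
print(sims, mean_sim)
```

Under this kind of measure, a low average nearest-neighbor similarity from an agrochemical dataset to a medicinal one would be consistent with the two occupying different regions of chemical space.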
