Randomness control and reproducibility study of random forest algorithm in R and Python

Louisa Camadini, Yanis Bouzid, Maeva Merlet, Léopold Carron

TL;DR

This study addresses the reproducibility of random forest models across R and Python implementations by comparing four packages (randomForest and Ranger in R; SKRanger and Scikit-Learn in Python) under controlled PRNG conditions. It shows that bootstrap sampling and random feature selection introduce variability but that exact reproducibility is attainable by standardizing seeds, disabling bootstrap, using all features, and aligning parameter semantics (notably min_node_size and nodesize) and aggregation methods. The authors identify three salient findings—randomness in splitting variable selection, the critical role of min_node_size for SKRanger, and the interpretation mismatch of nodesize versus min_samples_split—and demonstrate how to achieve identical trees and classifications across implementations. The work has practical implications for regulatory risk assessment, offering a concrete procedure to ensure reproducible RF results across commonly used languages and packages when evaluating cosmetic product safety.

Abstract

When it comes to the safety of cosmetic products, compliance with regulatory standards is crucial to guarantee consumer protection against the risks of skin irritation. Toxicologists must therefore be fully conversant with all risks. This applies not only to their day-to-day work, but also to all the algorithms they integrate into their routines. Recognizing this, ensuring the reproducibility of algorithms becomes one of the most crucial aspects to address. However, how can we prove the robustness of an algorithm such as the random forest, that relies heavily on randomness? In this report, we will discuss the strategy of integrating random forest into ocular tolerance assessment for toxicologists. We will compare four packages: randomForest and Ranger (R packages), adapted in Python via the SKRanger package, and the widely used Scikit-Learn with the RandomForestClassifier() function. Our goal is to investigate the parameters and sources of randomness affecting the outcomes of Random Forest algorithms. By setting comparable parameters and using the same Pseudo-Random Number Generator (PRNG), we expect to reproduce results consistently across the various available implementations of the random forest algorithm. Nevertheless, this exploration will unveil hidden layers of randomness and guide our understanding of the critical parameters necessary to ensure reproducibility across all four implementations of the random forest algorithm.
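
As a rough illustration of the two sources of randomness named above (bootstrap sampling and random feature selection), the following minimal Python sketch uses only the standard library. The helper `tree_inputs` and its parameters (`bootstrap`, `mtry`) are hypothetical names chosen for this example. The sketch mimics the random draws a single tree might make; it is not the internal logic of randomForest, Ranger, SKRanger, or Scikit-Learn.

```python
import random


def tree_inputs(n_samples, n_features, seed, bootstrap=True, mtry=None):
    """Return the (row indices, candidate feature indices) one tree would see.

    Hypothetical helper for illustration only: it imitates the two random
    steps of a random forest, not any package's actual implementation.
    """
    rng = random.Random(seed)  # dedicated PRNG, so the seed alone fixes the draws
    if bootstrap:
        # Bootstrap: draw n_samples row indices with replacement.
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
    else:
        # No bootstrap: every tree sees every row, in order.
        rows = list(range(n_samples))
    mtry = n_features if mtry is None else mtry
    # Candidate features for one split, drawn without replacement.
    features = sorted(rng.sample(range(n_features), mtry))
    return rows, features


# Same seed, same PRNG: the random draws are identical.
same_seed_a = tree_inputs(10, 5, seed=42, mtry=2)
same_seed_b = tree_inputs(10, 5, seed=42, mtry=2)

# Bootstrap disabled and all features used: the seed no longer matters.
no_random_a = tree_inputs(10, 5, seed=1, bootstrap=False)
no_random_b = tree_inputs(10, 5, seed=2, bootstrap=False)

print(same_seed_a == same_seed_b)
print(no_random_a == no_random_b)
```

Within one PRNG, a fixed seed reproduces the draws. Across packages, the same seed value need not produce the same draws, which is one reason to remove the random steps altogether when comparing implementations.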

Paper Structure (19 sections, 1 equation, 2 figures, 5 tables, 1 algorithm)