Investigating the Impact of Randomness on Reproducibility in Computer Vision: A Study on Applications in Civil Engineering and Medicine

Bahadır Eryılmaz; Osman Alperen Koraş; Jörg Schlötterer; Christin Seifert

TL;DR

This investigation examines CUDA-induced randomness on one standard benchmark dataset and two real-world datasets in an isolated environment. It finds that controlling this variability for reproducibility may entail increased runtime or reduced performance, but that these disadvantages are not as significant as reported in previous studies.

Abstract

Reproducibility is essential for scientific research. However, in computer vision, achieving consistent results is challenging due to various factors. One influential, yet often unrecognized, factor is CUDA-induced randomness. Despite CUDA's advantages for accelerating algorithm execution on GPUs, its behavior across multiple executions remains non-deterministic unless it is explicitly controlled. While reproducibility issues in ML are being researched, the implications of CUDA-induced randomness in applications are yet to be understood. Our investigation focuses on this randomness across one standard benchmark dataset and two real-world datasets in an isolated environment. Our results show that CUDA-induced randomness can account for differences of up to 4.77% in performance scores. We find that managing this variability for reproducibility may entail increased runtime or reduced performance, but that these disadvantages are not as significant as reported in previous studies.
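One commonly cited contributor to this kind of run-to-run variation is that floating-point addition is not associative, so a parallel reduction whose accumulation order changes between executions can return slightly different values. The sketch below illustrates only that arithmetic effect in plain Python. It is not taken from the paper, it does not use CUDA, and the helper name `sum_in_order` is hypothetical.

```python
import math


def sum_in_order(values, order):
    """Accumulate values in the given index order (hypothetical helper)."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


values = [0.1, 0.2, 0.3]

# Same numbers, two accumulation orders.
left_to_right = sum_in_order(values, [0, 1, 2])  # (0.1 + 0.2) + 0.3
right_to_left = sum_in_order(values, [2, 1, 0])  # (0.3 + 0.2) + 0.1

print(left_to_right)   # 0.6000000000000001
print(right_to_left)   # 0.6

# The results differ in the last bits but agree to within rounding error.
print(left_to_right == right_to_left)              # False
print(math.isclose(left_to_right, right_to_left))  # True

# Fixing the order makes the result repeatable across calls.
print(sum_in_order(values, [0, 1, 2]) == left_to_right)  # True
```

Differences of this size are tiny for a single sum; the abstract's point is that, accumulated over training, uncontrolled ordering can still lead to measurable differences in final scores.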

Paper Structure (9 sections, 4 figures, 8 tables)

This paper contains 9 sections, 4 figures, 8 tables.

Introduction
Related Work
Methodology
Results
Environmental Impact of the Experiments
Discussion and Conclusion
Appendix
Reproducing the Results
Additional Tables and Graphs from the Experiment Data

Figures (4)

Figure 1: Experimentation setup. 20, 15, and 10 runs for CIFAR, SDNET, and CBIS-DDSM, respectively, per seed and optimizer in the non-deterministic setup; 480 runs in total.
Figure 2: Cosine similarity of embeddings in the last linear layer of the models with respect to epoch number. Showing max, min, and mean cosine similarity values with the Adam optimizer for each dataset (rows) and seed (columns).
Figure 3: Cosine similarity of embeddings in the last linear layer of the models with respect to epoch number. Showing max, min, and mean cosine similarity values with the SGD optimizer for each dataset (rows) and seed (columns).
Figure 4: Mean standard deviation plots with respect to epoch number for each optimizer and dataset across five different seed configurations. The figures display the mean standard deviation in the classification head for each model and its progression over the course of training. The comparison between the SGD and Adam optimizers is highlighted for three datasets: CIFAR-10, CBIS-DDSM, and SDNET. The plots reveal the trends in standard deviations, showing how the randomness introduced by CUDA affects the consistency of model performance. Notably, certain seed values contribute to greater fluctuations, and the Adam optimizer often exhibits more pronounced variance than SGD, particularly in the later epochs.
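The max, min, and mean cosine similarity values named in the captions of Figures 2 and 3 could be computed along the following lines. This is a minimal sketch under stated assumptions: the captions do not say which embeddings are compared, so the pairwise comparison across runs is an assumption, and the function names `cosine_similarity` and `summarize_similarity` and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def summarize_similarity(embeddings):
    """Min, max, and mean cosine similarity over all pairs of runs.

    Assumption: each entry is the flattened embedding from one run
    at the same epoch, and every pair of runs is compared.
    """
    sims = [cosine_similarity(u, v) for u, v in combinations(embeddings, 2)]
    return {"min": min(sims), "max": max(sims), "mean": sum(sims) / len(sims)}


# Toy embeddings standing in for three runs (not data from the paper).
runs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
summary = summarize_similarity(runs)
print(summary)
```

Repeating this summary at every epoch would give one max, min, and mean curve per dataset and seed, which matches the layout the captions describe.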
