Common Challenges of Deep Reinforcement Learning Applications Development: An Empirical Study
Mohammad Mehdi Morovati, Florian Tambon, Mina Taraghi, Amin Nikanjam, Foutse Khomh
TL;DR
This paper addresses the lack of clarity around challenges in developing Deep Reinforcement Learning (DRL) applications through a large-scale empirical study of 927 DRL-related Stack Overflow posts. It builds a bottom-up taxonomy of DRL development challenges, validated through a survey of 65 practitioners and cross-checked with GitHub data. Comprehension, API usage, and Design emerge as the most frequent DRL-specific issues, while questions about Parallel processing and DRL libraries/frameworks take the longest to receive an accepted answer. The study highlights the need for better DRL documentation, tooling, and dependency management, and proposes avenues such as roadmaps, debugging tools, and pre-configured environments to improve DRL quality and development efficiency. Overall, the taxonomy and its validation offer a foundation for targeted support to DRL developers and can guide future research on DRL quality assurance and tooling.
Abstract
Machine Learning (ML) is increasingly being adopted in different industries. Deep Reinforcement Learning (DRL) is a subdomain of ML used to produce intelligent agents. Despite recent developments in DRL technology, the main challenges that developers face when building DRL applications are still unknown. To fill this gap, we conduct a large-scale empirical study of 927 DRL-related posts extracted from Stack Overflow, the most popular Q&A platform in the software community. By labeling and categorizing the extracted posts, we create a taxonomy of common challenges encountered in the development of DRL applications, along with their corresponding popularity levels. This taxonomy has been validated through a survey of 65 DRL developers. Results show that 18 of the 21 challenges identified in the taxonomy were each experienced by at least 45% of the surveyed developers. The most frequent sources of difficulty during the development of DRL applications are Comprehension, API usage, and Design problems, while Parallel processing and DRL libraries/frameworks are classified as the most difficult challenges to address, as measured by the time required to receive an accepted answer. We hope that the research community will leverage this taxonomy to develop efficient strategies to address the identified challenges and improve the quality of DRL applications.
