Support Vector Machine-Based Burnout Risk Prediction with an Interactive Interface for Organizational Use

Bruno W. G. Teodosio, Mário J. O. T. Lira, Pedro H. M. Araújo, Lucas R. C. Farias

TL;DR

The paper addresses predicting employee burnout risk from workplace and mental health features using supervised learning on the HackerEarth burnout dataset. It compares three regression models (KNN, Random Forest, and SVM with an RBF kernel) using 30-fold cross-validation. The SVM achieves the highest $R^2$ ($0.8479$), and paired $t$-tests indicate that it significantly outperforms the other two models. A practical contribution is a Streamlit-based no-code interface that lets non-technical users obtain real-time burnout risk predictions, enabling data-driven interventions in organizations. The work demonstrates the potential of ML for proactive mental health management at work while acknowledging normalization sensitivity and hyperparameter tuning as areas for refinement; future work includes exploring additional features and explainability to deepen actionable insights.

Abstract

Burnout is a psychological syndrome marked by emotional exhaustion, depersonalization, and reduced personal accomplishment, with a significant impact on individual well-being and organizational performance. This study proposes a machine learning approach to predict burnout risk using the HackerEarth Employee Burnout Challenge dataset. Three supervised algorithms were evaluated: k-nearest neighbors (KNN), random forest, and support vector machine (SVM), with model performance assessed through 30-fold cross-validation using the coefficient of determination ($R^2$). Among the models tested, SVM achieved the highest predictive performance ($R^2 = 0.84$) and was statistically superior to KNN and random forest based on paired $t$-tests. To ensure practical applicability, an interactive interface was developed using Streamlit, allowing non-technical users to input data and receive burnout risk predictions. The results highlight the potential of machine learning to support early detection of burnout and promote data-driven mental health strategies in organizational settings.
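
The two evaluation quantities named in the abstract, the coefficient of determination and the paired $t$-statistic over per-fold scores, can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `r2_score` and `paired_t_statistic` are hypothetical helpers, and the per-fold scores are made-up numbers for two unnamed models, not results from the paper.

```python
import math
from statistics import mean, stdev


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic on fold-wise score differences.

    Returns the statistic and its degrees of freedom (n - 1).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample std / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Illustrative per-fold R^2 scores for two hypothetical models.
# These values are invented for the example; the paper uses 30 folds.
model_a = [0.85, 0.84, 0.86, 0.83, 0.85]
model_b = [0.82, 0.83, 0.82, 0.81, 0.84]

t_stat, dof = paired_t_statistic(model_a, model_b)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

Under this setup, each model is scored on the same folds, so the test operates on the fold-wise differences rather than on two independent samples. The resulting statistic would then be compared against a $t$ distribution with $n - 1$ degrees of freedom to obtain a $p$-value.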

Paper Structure

This paper contains 25 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Bar chart showing the number of missing values per feature in the dataset. Only Resource Allocation, Mental Fatigue Score, and Burn Rate presented missing data; all other variables were complete.
  • Figure 2: Distribution of the Designation variable, illustrating the frequency of employee positions across hierarchical levels. The discrete pattern reflects structured job roles within the organization.
  • Figure 3: Distribution of the Resource Allocation variable across the dataset. Most values are concentrated between 4 and 6, suggesting a moderate and consistent allocation of resources among employees.
  • Figure 4: Correlation heatmap of key numerical features in the dataset. The highest correlation was observed between Mental Fatigue Score and Burn Rate ($r = 0.94$), suggesting a strong association between psychological strain and burnout risk.
  • Figure 5: Multivariate plot illustrating the relationship between Burn Rate, Mental Fatigue Score, and Resource Allocation. Higher levels of mental fatigue and resource allocation are associated with increased burnout rates, as shown by the upward trend and color gradient.
  • ...and 3 more figures