Benchmarking Classical, Deep, and Generative Models for Human Activity Recognition
Md Meem Hossain, The Anh Han, Safina Showkat Ara, Zia Ush Shamszaman
TL;DR
This paper benchmarks classical machine learning, deep learning, and Restricted Boltzmann Machine (RBM) based methods for Human Activity Recognition (HAR) on five standard datasets (UCI-HAR, OPPORTUNITY, PAMAP2, WISDM, Berkeley MHAD). It evaluates a broad set of models, including DT, RF, SVM variants, CNN, RNN/LSTM/BiLSTM/GRU, ANN, DBNs, and DBMs, using accuracy, precision, recall, and F1-score. CNNs consistently outperform the other models, especially on complex datasets such as Berkeley MHAD, while RF excels on smaller datasets and RBM-based models show strong feature-learning potential, sometimes approaching the performance of deep models. These results provide practical guidance for selecting HAR models according to dataset size, modality, and computational constraints, and they highlight RBMs as a viable path for feature learning and cross-domain HAR exploration.
Abstract
Human Activity Recognition (HAR) has gained significant importance with the growing use of sensor-equipped devices and large datasets. This paper evaluates three categories of models, namely classical machine learning, deep learning architectures, and Restricted Boltzmann Machines (RBMs), on five key HAR benchmark datasets (UCI-HAR, OPPORTUNITY, PAMAP2, WISDM, and Berkeley MHAD). We assess a range of models, including Decision Trees, Random Forests, Convolutional Neural Networks (CNNs), and Deep Belief Networks (DBNs), using accuracy, precision, recall, and F1-score for a comprehensive comparison. The results show that CNN models offer superior performance across all datasets, especially on Berkeley MHAD. Classical models such as Random Forest perform well on smaller datasets but struggle with larger, more complex data. RBM-based models also show notable potential, particularly for feature learning. This paper offers a detailed comparison to help researchers choose the most suitable model for HAR tasks.
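To make the four evaluation metrics concrete, the following is a minimal Python sketch of how accuracy, precision, recall, and F1-score might be computed for a multi-class HAR classifier. It is an illustration, not the evaluation code used in this paper: the function name `classification_metrics`, the macro-averaging scheme (an unweighted mean over classes), and the toy activity labels are all assumptions introduced here.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption of this sketch; the source does not
    state which averaging scheme it uses.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count per-class true positives, false positives, and false negatives.
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        # A class with no predictions (or no true samples) scores 0.0.
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with hypothetical activity labels.
y_true = ["walk", "walk", "sit", "sit", "stand", "stand"]
y_pred = ["walk", "sit", "sit", "sit", "stand", "walk"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting all four metrics matters for HAR because activity classes are often imbalanced: a model can reach high accuracy while rarely predicting a minority activity, which macro-averaged recall and F1 would expose.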
