VecLSTM: Trajectory Data Processing and Management for Activity Recognition through LSTM Vectorization and Database Integration

Solmaz Seyed Monir, Dongfang Zhao

TL;DR

VecLSTM addresses scalable trajectory-based activity recognition with a vectorization layer that converts GPS sequences into a structured grid, so that a CNN-LSTM hybrid can learn both spatial and temporal patterns. The method integrates with a MySQL vector database for large-scale data management and uses an architecture that combines a CNN with two LSTMs, merging spatial and temporal features for prediction. On a GeoLife-derived dataset, the model reaches a validation accuracy of 85.57%, a test accuracy of 85.47%, and a weighted F1-score of 0.86, and it cuts training time by up to about 74.2% overall when vectorized and non-vectorized baselines are compared. The work targets real-time trajectory analysis and scalable trajectory data management; future work focuses on refining the vectorization and exploring orthogonal data representations.
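
As a rough illustration of what converting GPS sequences into a structured grid can look like, the sketch below bins latitude/longitude points into a fixed-size count grid. It is not the authors' vectorization layer: the function name `gps_to_grid`, the bounding box, the grid size, and the sample coordinates are all assumptions made for this example.

```python
def gps_to_grid(points, bounds, grid_size=2):
    """Bin (lat, lon) points into a grid_size x grid_size grid of counts.

    Hypothetical helper for illustration; the paper's actual
    vectorization layer may differ.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    grid = [[0] * grid_size for _ in range(grid_size)]
    for lat, lon in points:
        # Ignore points that fall outside the bounding box.
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue
        # Scale each coordinate to [0, grid_size] and clamp the upper edge
        # so that points on the max boundary land in the last cell.
        row = min(int((lat - lat_min) / (lat_max - lat_min) * grid_size), grid_size - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * grid_size), grid_size - 1)
        grid[row][col] += 1
    return grid


# Made-up trajectory and bounding box (lat_min, lat_max, lon_min, lon_max).
trajectory = [(39.90, 116.30), (39.91, 116.31), (39.97, 116.45), (40.00, 116.50)]
bounds = (39.90, 40.00, 116.30, 116.50)
grid = gps_to_grid(trajectory, bounds, grid_size=2)
print(grid)
```

A grid like this has a fixed shape regardless of trajectory length, which is what lets a convolutional layer consume it; the ordering of points, which an LSTM would use, is deliberately discarded in this simplified version.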

Abstract

Activity recognition is a challenging task due to the large scale of trajectory data and the need for prompt and efficient processing. Existing methods have attempted to mitigate this problem by employing traditional LSTM architectures, but these approaches often process large datasets inefficiently. In response, we propose VecLSTM, a novel framework that enhances the performance and efficiency of LSTM-based neural networks. Unlike conventional approaches, VecLSTM incorporates vectorization layers, leveraging optimized mathematical operations to process input sequences more efficiently. We have implemented VecLSTM and integrated it with the MySQL database. To evaluate its effectiveness, we compare VecLSTM against a conventional LSTM model on a dataset comprising 1,467,652 samples with seven unique labels. Experimental results demonstrate superior accuracy and efficiency compared to the state of the art, with VecLSTM achieving a validation accuracy of 85.57%, a test accuracy of 85.47%, and a weighted F1-score of 0.86. Furthermore, VecLSTM reduces training time by 26.2% compared to traditional LSTM models.
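
For readers unfamiliar with the weighted F1-score reported above, the following minimal sketch computes it as the support-weighted mean of per-class F1 scores. The function name `weighted_f1` and the toy labels are assumptions for illustration; they are not taken from the paper's dataset or code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Toy labels, made up for this example.
y_true = ["walk", "walk", "bus", "bus", "bus", "car"]
y_pred = ["walk", "bus", "bus", "bus", "car", "car"]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Weighting by support means frequent activity classes dominate the score, which matters for imbalanced label distributions.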

Paper Structure

This paper contains 27 sections, 9 equations, 4 figures, 2 tables, 4 algorithms.

Figures (4)

  • Figure 1: VecLSTM Efficiency Enhancement Proposal: The VecLSTM framework optimizes efficiency through vectorization and streamlined vector database operations. It integrates advanced vectorization methodologies to refine data representation, expediting computations and model training. Additionally, optimized vector database techniques reduce query durations, strengthening system efficiency.
  • Figure 2: Comparison of model performance and ROC curve.
  • Figure 3: Performance of the activity recognition model across stages of preprocessing and modeling: (a) before vectorization, (b) after vectorization, and (c) hybrid model with vectorized input data. The hybrid model with vectorized input data performs best.
  • Figure 4: Comparison of RMSE, MAE, and MSE metrics for four models. The proposed model exhibits the lowest RMSE and MSE values, indicating superior accuracy.
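
The three error metrics named in Figure 4 are related by simple formulas, sketched below under the assumption that predictions and targets are numeric sequences of equal length. The helper name `regression_errors` and the sample values are hypothetical and do not reproduce any result from the paper.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MSE, MAE, and RMSE for two equal-length numeric sequences."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n   # mean squared error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    rmse = math.sqrt(mse)                     # root of the MSE
    return {"mse": mse, "mae": mae, "rmse": rmse}


# Made-up values: three exact predictions and one that is off by 4.
errors = regression_errors([1, 2, 3, 4], [1, 2, 3, 8])
print(errors)
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE can rank them differently since it penalizes large errors less heavily.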