Hidden markov model to predict tourists visited place

Theo Demessance, Chongke Bi, Sonia Djebali, Guillaume Guerard

TL;DR

This paper tackles predicting tourist movements from geo-located social data by combining grammatical inference (GI) with Hidden Markov Models (HMMs) in a scalable pipeline designed for big data. It introduces a Frequency Prefix Tree to capture sequence structure, applies Relaxed Alergia to merge compatible states, and converts the resulting frequency automaton into an HMM, with predictions derived via the Viterbi algorithm and updates via Baum-Welch. The key contributions are the GI-based learning framework for large sequence sets, the Relaxed Alergia compatibility mechanism, and an updatable HMM that reflects new data while preserving behavioral possibilities. The method is demonstrated on a Paris case study, where updating the model reduces the mean absolute percentage error (MAPE) from around 20.8% to below 10%. The approach provides a practical, adaptable tool for tourism marketing and decision support, enabling targeted recommendations and scenario modeling across tourist groups. Future work aims to tailor models to demographic profiles and to benchmark against deep learning methods.
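To make the first stage of the pipeline concrete, the following is a minimal sketch of a frequency prefix tree built from visit sequences. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `build_fpt`, and `next_place_frequencies` are hypothetical, the toy trips are invented, and the state merging (Relaxed Alergia) and the HMM conversion are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One prefix of visited places (hypothetical structure for illustration)."""
    count: int = 0                                 # sequences passing through this prefix
    ends: int = 0                                  # sequences ending exactly at this prefix
    children: dict = field(default_factory=dict)   # place name -> Node


def build_fpt(sequences):
    """Build a frequency prefix tree from an iterable of place sequences."""
    root = Node()
    for seq in sequences:
        node = root
        node.count += 1
        for place in seq:
            node = node.children.setdefault(place, Node())
            node.count += 1
        node.ends += 1
    return root


def next_place_frequencies(root, prefix):
    """Relative frequency of each next place after `prefix` (empty if unseen)."""
    node = root
    for place in prefix:
        node = node.children.get(place)
        if node is None:
            return {}
    return {p: child.count / node.count for p, child in node.children.items()}


# Toy data, invented for this example only.
trips = [
    ["Louvre", "Eiffel Tower"],
    ["Louvre", "Eiffel Tower", "Notre-Dame"],
    ["Louvre", "Notre-Dame"],
    ["Eiffel Tower"],
]
tree = build_fpt(trips)
freqs = next_place_frequencies(tree, ["Louvre"])
print(freqs)
```

In this sketch each node stores how many sequences pass through it and how many end there, which are the counts a frequency-based compatibility test between states would typically need.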

Abstract

Nowadays, social networks are becoming a popular way of analyzing tourist behavior, thanks to the digital traces left by travelers on these networks during their stays. The massive amount of data generated by the propensity of tourists to share comments and photos during their trip makes it possible to model their journeys and analyze their behavior. Predicting the next movement of tourists plays a key role in tourism marketing to understand demand and improve decision support. In this paper, we propose a method to understand and learn tourists' movements, based on social network data analysis, in order to predict future movements. The method relies on a machine learning grammatical inference algorithm. A major contribution of this paper is to adapt the grammatical inference algorithm to the context of big data. Our method produces a hidden Markov model representing the movements of a group of tourists. The hidden Markov model is flexible and can be updated with new data. Paris, the capital city of France, is selected to demonstrate the efficiency of the proposed methodology.

Paper Structure

This paper contains 20 sections, 1 equation, 5 figures, 2 algorithms.

Figures (5)

  • Figure 1: Example of merging two stays.
  • Figure 2: Merge and fold operations of the Relaxed Alergia algorithm.
  • Figure 3: Changing a stochastic automaton into an HMM.
  • Figure 4: HMM's validation.
  • Figure 5: HMM's predictions.

Theorems & Definitions (4)

  • Definition 1: Compatibility test
  • Definition 2: Relative Frequency
  • Definition 3: Merge operation
  • Definition 4: Fold operation