Forecasting the Evolving Composition of Inbound Tourism Demand: A Bayesian Compositional Time Series Approach Using Platform Booking Data

Harrison Katz

Abstract

Understanding how the composition of guest origin markets evolves over time is critical for destination marketing organizations, hospitality businesses, and tourism planners. We develop and apply Bayesian Dirichlet autoregressive moving average (BDARMA) models to forecast the compositional dynamics of guest origin market shares using proprietary Airbnb booking data spanning 2017--2024 across four major destination regions. Our analysis reveals substantial pandemic-induced structural breaks in origin composition, with heterogeneous recovery patterns across markets. The BDARMA framework achieves the lowest average forecast error across all destination regions, outperforming standard benchmarks including naïve forecasts, exponential smoothing, and SARIMA on log-ratio transformed data. For EMEA destinations, BDARMA achieves 23% lower forecast error than naïve forecasts, a statistically significant improvement. By modeling compositions directly on the simplex with a Dirichlet likelihood and incorporating seasonal variation in both mean and precision parameters, our approach produces coherent forecasts that respect the unit-sum constraint while capturing complex temporal dependencies. The methodology provides destination stakeholders with probabilistic forecasts of source market shares, enabling more informed strategic planning for marketing resource allocation, infrastructure investment, and crisis response.

Paper Structure (34 sections, 7 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: Guest origin market shares by destination region, January 2017--December 2024. Stacked area charts show the compositional evolution of the top seven origin markets plus "Other" for each destination. The vertical dashed line indicates March 2020 (onset of COVID-19 pandemic). Note the dramatic compositional shifts during 2020--2021, particularly the collapse of Chinese outbound travel to APAC and the surge in within-region bookings across all destinations.
  • Figure 2: Market concentration over time by destination region. The Herfindahl-Hirschman Index (HHI) measures origin market concentration, with higher values indicating dominance by fewer markets. NAMER exhibits consistently high concentration due to U.S. dominance, while EMEA maintains diverse origin portfolios. The vertical dashed line indicates March 2020.
  • Figure 3: Average autocorrelation of CLR-transformed origin shares by destination region. All regions show substantial persistence, with NAMER exhibiting the slowest decay. The seasonal bump at lag 12 for APAC and LATAM motivates including Fourier terms for seasonality.
  • Figure 4: Seasonal pattern in compositional deviation for EMEA. Boxplots show Aitchison distance from mean composition by calendar month; red diamonds indicate monthly means. Spring and early summer months exhibit greater compositional dispersion than autumn, motivating the seasonal precision specification in our BDARMA models.
  • Figure 5: Model comparison via leave-one-out cross-validation for EMEA. Points indicate posterior mean ELPD; error bars show $\pm 2$ standard errors. The BDARMA(1,1) specification achieves the highest expected log predictive density.
  • ...and 3 more figures
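
The captions for Figures 2 to 4 refer to three standard compositional summaries: the Herfindahl-Hirschman Index (HHI), the centered log-ratio (CLR) transform, and the Aitchison distance. The following is a minimal sketch of their textbook definitions, not the paper's own code. The share vectors are hypothetical illustration values, the function names are made up for this example, and the HHI is assumed to be on the 0 to 1 scale (the paper may use a different scaling).

```python
import math


def hhi(shares):
    """Herfindahl-Hirschman Index: sum of squared shares.

    Assumes shares are proportions summing to one, so the result lies
    between 1/len(shares) (evenly spread) and 1 (a single market).
    """
    return sum(s * s for s in shares)


def clr(shares):
    """Centered log-ratio transform: log shares minus their mean log.

    Requires strictly positive shares; the output sums to zero.
    """
    logs = [math.log(s) for s in shares]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the CLR transforms of two compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


# Hypothetical origin-share vectors (not taken from the paper's data).
concentrated = [0.70, 0.10, 0.10, 0.10]  # one dominant origin market
diverse = [0.25, 0.25, 0.25, 0.25]       # evenly spread origin markets

print("HHI concentrated:", hhi(concentrated))
print("HHI diverse:", hhi(diverse))
print("CLR concentrated:", clr(concentrated))
print("Aitchison distance:", aitchison_distance(concentrated, diverse))
```

Under these definitions, a destination dominated by one origin market has a higher HHI than one with evenly spread origins, and the Aitchison distance from a reference composition (for example, the mean composition mentioned in the Figure 4 caption) grows as the share ratios diverge from it.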