A Prescriptive Framework for Determining Optimal Days for Short-Term Traffic Counts

Arthur Mukwaya, Nancy Kasamala, Nana Kankam Gyimah, Judith Mwakalonge, Gurcan Comert, Saidi Siuhi, Denis Ruganuza, Mark Ngotonie

TL;DR

This paper addresses how to improve AADT estimation on unmonitored roads by selecting the most informative days for short-term counts. It introduces a data-driven prescriptive framework that uses continuous count (CC) and short count (SC) data from Texas for 2022–2023, feature engineering with leave-one-out averaging and clustering, and an iterative XGBoost-based modeling approach to identify optimal count days. Optimally selected days yield lower RMSE, MAE, and MAPE and higher R^2 than a 'no optimal day' baseline that reflects current DOT practice, with Day 186 performing best. The approach offers a scalable, cost-effective means for DOTs to enhance AADT accuracy and comply with Highway Performance Monitoring System requirements while reducing statewide data collection costs.

Abstract

The Federal Highway Administration (FHWA) mandates that state Departments of Transportation (DOTs) collect reliable Annual Average Daily Traffic (AADT) data. However, many U.S. DOTs struggle to obtain accurate AADT, especially for unmonitored roads. While continuous count (CC) stations offer accurate traffic volume data, they are expensive and difficult to deploy widely, compelling agencies to rely on short-duration traffic counts. This study proposes a machine learning framework, the first to our knowledge, that identifies optimal representative days for short count (SC) data collection to improve AADT prediction accuracy. Using 2022 and 2023 traffic volume data from the state of Texas, we compare two scenarios: an 'optimal day' approach that iteratively selects the most informative days for AADT estimation and a 'no optimal day' baseline reflecting current practice by most DOTs. To align with Texas DOT's traffic monitoring program, continuous count data were used to simulate the 24-hour short counts. The actual field short counts were used to enhance feature engineering: a leave-one-out (LOO) technique generates unbiased representative daily traffic features across similar road segments. Our proposed methodology outperforms the baseline across the top five days, with the best day (Day 186) achieving lower errors (RMSE: 7,871.15, MAE: 3,645.09, MAPE: 11.95%) and higher R^2 (0.9756) than the baseline (RMSE: 11,185.00, MAE: 5,118.57, MAPE: 14.42%, R^2: 0.9499). This research offers DOTs an alternative to conventional short-duration count practices, improving AADT estimation, supporting Highway Performance Monitoring System compliance, and reducing the operational costs of statewide traffic data collection.
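
The leave-one-out idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "similar road segments" are represented by a cluster label and that the feature is the mean count of the other stations in the same cluster; the function name, the cluster labels, and the sample numbers are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def leave_one_out_means(values, groups):
    """For each item, return the mean of the OTHER items in its group.

    Excluding the item's own value keeps the feature from leaking the
    quantity it is meant to help predict. Items that are alone in their
    group get None, because there are no peers to average.
    """
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        sizes[group] += 1

    result = []
    for value, group in zip(values, groups):
        if sizes[group] > 1:
            # Subtract the item's own value, then average the remainder.
            result.append((totals[group] - value) / (sizes[group] - 1))
        else:
            result.append(None)
    return result


# Hypothetical daily counts at five stations in two clusters (illustrative only).
counts = [100.0, 200.0, 300.0, 50.0, 70.0]
clusters = ["A", "A", "A", "B", "B"]
loo = leave_one_out_means(counts, clusters)
```

With these inputs the first station's feature is the mean of the other two stations in cluster A, so its own count never enters its own feature.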

Paper Structure

This paper contains 34 sections, 5 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Distribution of active permanent count stations in Texas (2023) by functional class and area type, highlighting the absence of coverage on local roads and limited representation in lower-priority urban areas.
  • Figure 2: Proposed workflow for optimal short-duration traffic count day selection showing (1) Data preprocessing with wide format conversion, daily count aggregation, missing value imputation and day filtering, (2) Feature engineering using roadway attributes, clustering, and leave-one-out averaging and (3) XGBoost modeling with performance evaluation using RMSE, MAE, MAPE, and R² metrics.
  • Figure 3: Study area of Texas State showing Permanent Stations and Short Count Stations in 2022 and 2023.
  • Figure 4: Distribution of short duration traffic volume counts in Texas (2022–2023) across (A) Months, (B) Weekdays, and (C) Functional class.
  • Figure 5: Performance comparison of XGBoost and random forest for imputing missing traffic counts, evaluated on a 10% held-out sample.
  • ...and 3 more figures
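
The Figure 2 caption lists RMSE, MAE, MAPE, and R² as the evaluation metrics for candidate count days. The sketch below computes those four metrics in plain Python and ranks candidate days by RMSE. It is only an illustration: the per-day predictions are assumed to be supplied by some model (the paper uses XGBoost, which is not reproduced here), and the function names, day labels, and numbers are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for paired sequences."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE assumes every actual value is non-zero.
        "mape": 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def rank_days_by_rmse(actual_aadt, predictions_by_day):
    """Order candidate days from lowest to highest RMSE."""
    scored = [
        (regression_metrics(actual_aadt, preds)["rmse"], day)
        for day, preds in predictions_by_day.items()
    ]
    return [day for _, day in sorted(scored)]


# Hypothetical AADT values and per-day predictions (illustrative only).
aadt = [100.0, 200.0, 300.0]
predictions = {
    1: [110.0, 190.0, 330.0],
    2: [100.0, 210.0, 300.0],
}
ranking = rank_days_by_rmse(aadt, predictions)
day2 = regression_metrics(aadt, predictions[2])
```

Ranking by RMSE is one possible choice; the same loop could sort on MAE or MAPE if large errors on high-volume roads should weigh less.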