Optimizing Urban Mobility Through Complex Network Analysis and Big Data from Smart Cards

Li Sun, Negin Ashrafi, Maryam Pishgar

TL;DR

The paper examines how high-frequency (HF) and low-frequency (LF) travelers shape urban public transit networks, using Beijing's smart-card data. It proposes a scalable pipeline that combines data preprocessing, station clustering, and complex-network analysis, including HF/LF network construction, visualization, and robustness and peak-hour assessments. Key findings show that HF networks are highly connected but less robust, that LF networks are more dispersed and resilient, and that peak hours induce congestion with longer path lengths; recommended strategies include diversifying HF routes and enhancing LF accessibility. The work advances a scalable framework for analyzing large-scale automated fare collection (AFC) data to inform transit planning, robustness enhancement, and sustainable urban mobility.

Abstract

This study investigates the network characteristics of high-frequency (HF) and low-frequency (LF) travelers in urban public transport systems by analyzing 20 million smart card records from Beijing's transit network. A novel methodology integrates advanced data preprocessing, clustering techniques, and complex network analysis to differentiate HF and LF passenger behaviors and their impacts on network structure, robustness, and efficiency. The primary challenge is accurately segmenting and modeling the behaviors of diverse passenger groups within a large-scale, noisy dataset while maintaining computational efficiency and scalability. HF networks, representing the top 25% of travelers by usage frequency, exhibit high connectivity, with an average clustering coefficient of 0.72 and greater node degree centrality. However, they are less robust: efficiency declines by 35% under targeted disruptions, and the average path length rises to 6.2 during peak hours. In contrast, LF networks, which comprise the remaining 75% of travelers, are more dispersed yet resilient, with efficiency declining by only 10% under similar disruptions and with stronger intracommunity connectivity. Temporal analysis reveals that HF passengers contribute significantly to peak-hour congestion, with 57.4% of HF trips occurring between 6:00 and 10:00 AM, while LF passengers show a broader temporal distribution, helping to mitigate congestion hotspots. Understanding these travel patterns is crucial for optimizing public transit systems. The findings suggest targeted strategies such as enhancing robustness in HF networks by diversifying key routes and improving accessibility in LF-dominated areas. This research provides a scalable framework for analyzing smart card data and offers actionable insights for optimizing transit networks, improving congestion management, and advancing sustainable urban mobility planning.
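The segmentation and robustness ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout `(card_id, origin, destination)`, the function names, the toy trips, and the choice of "remove the highest-degree station" as the targeted disruption are all assumptions made for illustration. Efficiency here is the standard global efficiency (mean inverse shortest-path length), recomputed over the stations that remain after removal.

```python
from collections import Counter, deque

def split_travelers(trips, hf_share=0.25):
    """Split card IDs into HF (top hf_share by trip count) and LF (the rest)."""
    counts = Counter(card for card, _, _ in trips)
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    n_hf = max(1, round(len(ranked) * hf_share))
    return set(ranked[:n_hf]), set(ranked[n_hf:])

def build_network(trips, cards):
    """Undirected station graph built from trips made by the given cards."""
    adj = {}
    for card, origin, dest in trips:
        if card in cards and origin != dest:
            adj.setdefault(origin, set()).add(dest)
            adj.setdefault(dest, set()).add(origin)
    return adj

def global_efficiency(adj):
    """Mean of 1/d(u, v) over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for src in adj:
        # Breadth-first search gives hop distances from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))

def remove_top_hub(adj):
    """Targeted disruption (assumed form): drop the highest-degree station."""
    hub = max(adj, key=lambda s: (len(adj[s]), s))
    return {u: nbrs - {hub} for u, nbrs in adj.items() if u != hub}

# Hypothetical toy records: (card_id, origin_station, destination_station).
trips = [
    ("c1", "A", "B"), ("c1", "A", "C"), ("c1", "A", "D"), ("c1", "B", "A"),
    ("c2", "A", "B"), ("c3", "B", "C"), ("c4", "C", "A"),
]

hf_cards, lf_cards = split_travelers(trips)
hf_net = build_network(trips, hf_cards)
lf_net = build_network(trips, lf_cards)

hf_before = global_efficiency(hf_net)
hf_after = global_efficiency(remove_top_hub(hf_net))
lf_before = global_efficiency(lf_net)
lf_after = global_efficiency(remove_top_hub(lf_net))

print("HF efficiency:", hf_before, "->", hf_after)
print("LF efficiency:", lf_before, "->", lf_after)
```

In this toy data the single HF traveler produces a hub-and-spoke network that loses all efficiency once its hub is removed, while the LF triangle keeps its efficiency. The example only mirrors the direction of the reported contrast; the figures in the abstract (35% versus 10%) come from the real data, not from this sketch.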

Paper Structure

This paper contains 26 sections, 4 equations, 18 figures, 32 tables.

Figures (18)

  • Figure 1: Flowchart of data preprocessing, clustering, and network analysis processes.
  • Figure 2: A GIS diagram of bus stops and subway stations.
  • Figure 3: A GIS diagram of bus stops and subway stations.
  • Figure 4: Cumulative Frequency by travel counts – week 1 (left) and week 2 (right).
  • Figure 5: Complex Network Construction Network
  • ...and 13 more figures