Understanding U.S. Racial Segregation Through Persistent Homology

Ori Friesen, Lori Ziegelmeier

Abstract

Racial segregation is a widespread social and physical phenomenon present in every city across the United States. Although prevalent nationwide, each city has a unique history of racial segregation, resulting in distinct "shapes" of segregation. We use persistent homology, a technique from applied algebraic topology, to investigate whether common patterns of racial segregation exist among U.S. cities. We explore two methods of constructing simplicial complexes that preserve geospatial data, applying them to White, Black, Asian, and Hispanic demographic data from the U.S. census for 112 U.S. cities. Using these methods, we cluster the cities based on their persistence to identify groups with similar segregation "shapes". Finally, we apply cluster analysis techniques to explore the characteristics of our clusters. This includes calculating the mean cluster statistics to gain insights into the demographics of each cluster and using the Adjusted Rand Index to compare our results with other clustering methods.
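As a rough illustration of the last step, the Adjusted Rand Index (ARI) between two clusterings of the same cities can be computed from pair counts alone. The sketch below is a minimal pure-Python version of the standard pair-counting formula; the function name `adjusted_rand_index` and the toy labels are assumptions for illustration, not the authors' code or data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat the labelings as identical

    # Contingency counts: items sharing a cluster in both labelings,
    # and cluster sizes in each labeling separately.
    joint = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)

    pairs_joint = sum(comb(c, 2) for c in joint.values())
    pairs_a = sum(comb(c, 2) for c in sizes_a.values())
    pairs_b = sum(comb(c, 2) for c in sizes_b.values())

    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2           # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (pairs_joint - expected) / (maximum - expected)


# Hypothetical toy labels for four cities under two clustering methods.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0 (same partition)
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```

An ARI of 1 means the two methods produce the same partition up to relabeling, while values near 0 indicate agreement no better than chance.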

Paper Structure

This paper contains 17 sections, 7 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Racial distribution maps of Milwaukee, WI (left), and Atlanta, GA (right), with majority White areas shown in blue, majority Black areas in green, majority Hispanic areas in yellow, and majority Asian areas in red. Source: bestneighborhood.
  • Figure 2: The procedure for transforming the city's shapefile (left) into a TIFF image (right) for the Chicago-White pair. The shapefile displays census tract boundaries, with those shaded in black representing areas with a majority White population, while the white-colored tracts have a different demographic majority. For the TIFF image, we convert and include only the tracts that are colored black in the shapefile.
  • Figure 3: An example of developing a persistence barcode from the level-set boundary expansion of a manifold using the Chicago-White census tracts. The images above the barcode show how the purple manifolds and the yellow empty space develop throughout the level-set process. We then find the persistence of the connected components ($H_0$ features) and topological loops ($H_1$ features) and represent it as a persistence barcode.
  • Figure 4: The grayscale image of the Chicago-White city-race pair corresponding to the cubical complex filtration. Here, the grayscale image acts as a choropleth map for the White population of Chicago, where yellow census tracts are areas with a lower White population percentage and blue ones represent higher population percentages.
  • Figure 5: An example of developing persistence through cubical complexes for the Chicago-White city-race pair. The images above the barcode are the census tracts corresponding to the respective $\varepsilon$ value. Compared to the barcode in Figure 3, the cubical persistence barcode contains many more features. Additionally, while $H_0$ features were only born at $t=0$ for level-set persistence, with cubical persistence, $H_0$ features can be born at any $\varepsilon$.
  • ...and 7 more figures
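
The remark in the Figure 5 caption that $H_0$ features can be born at any $\varepsilon$ can be seen in a small sketch of sublevel-set persistence on a grayscale grid. The code below is a minimal pure-Python illustration, assuming pixels act as vertices joined by 4-connectivity and using a union-find structure with the elder rule; the function name `h0_sublevel_persistence` and the toy image are hypothetical, and this is not necessarily the cubical construction or the software used in the paper.

```python
from math import inf


def h0_sublevel_persistence(image):
    """(birth, death) pairs of connected components as the threshold rises."""
    rows, cols = len(image), len(image[0])
    # Pixels enter the filtration in order of increasing grayscale value.
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already in the filtration
    birth = {}   # birth value stored at each component's root
    pairs = []

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for value, r, c in order:
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (r + dr, c + dc)
            if neighbor not in parent:
                continue  # outside the grid or not yet in the filtration
            older, younger = find((r, c)), find(neighbor)
            if older == younger:
                continue
            if birth[older] > birth[younger]:
                older, younger = younger, older
            # Elder rule: the younger component dies at the current value.
            if birth[younger] < value:  # skip zero-persistence pairs
                pairs.append((birth[younger], value))
            parent[younger] = older

    # A 4-connected grid ends as one component, which never dies.
    pairs.append((order[0][0], inf))
    return sorted(pairs)


# Toy 3x3 "image": four low-valued corners separated by a ridge of value 5.
toy = [[1, 5, 2],
       [5, 5, 5],
       [3, 5, 4]]
print(h0_sublevel_persistence(toy))
# [(1, inf), (2, 5), (3, 5), (4, 5)]
```

Each corner starts its own component at its own grayscale value, so births occur at 1, 2, 3, and 4 rather than all at the start of the filtration, and the three younger components die when the ridge at value 5 joins them to the oldest one.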