Analyzing and Optimizing the Distribution of Blood Lead Level Testing for Children in New York City: A Data-Driven Approach
Mohamed Afane, Juntao Chen
TL;DR
This paper addresses neighborhood-level inequities in blood lead level (BLL) testing in NYC by proposing a data-driven distribution of testing resources. It combines $k$-medoids clustering, which defines neighborhood risk profiles, with a grid search that tunes testing allocations through the weights $p_1$ and $p_2$ and the total number of tests $T$, aiming to maximize detected cases while preserving fairness. A regression-based projection estimates 260,000 tests; comparing the current distribution (2,860 cases) with the optimized one (3,270 cases) yields a $14.3\%$ increase in detected cases that is statistically significant ($p<0.05$). The findings support reallocating tests toward higher-risk areas, standardizing the optimized distribution, and implementing awareness campaigns and finer-grained data tracking to improve public health impact.
Abstract
This study investigates blood lead level (BLL) rates and testing among children under six years of age across the 42 neighborhoods in New York City from 2005 to 2021. Despite a general citywide decline in BLL rates, disparities at the neighborhood level persist and are not addressed in the official reports, highlighting the need for this comprehensive analysis. In this paper, we analyze the current BLL testing distribution and cluster the neighborhoods using a k-medoids clustering algorithm. We then propose an optimized allocation, found with a grid search algorithm, that improves resource allocation efficiency by accounting for case incidence and neighborhood risk profiles. Our findings demonstrate statistically significant improvements in case detection and enhanced fairness by focusing on underserved and high-risk groups. Additionally, we propose actionable recommendations to raise awareness among parents, including outreach at local daycare centers and kindergartens, among other venues.
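To make the grid search idea concrete, the following is a minimal Python sketch of one way such an allocation could be tuned. It is not the formulation used in this paper. The scoring rule (a weighted mix of incidence share and risk share), the constraint $p_1 + p_2 = 1$, the fairness floor `min_share`, and the function names `allocate`, `expected_cases`, and `grid_search` are all assumptions introduced for illustration, and the example inputs are toy values rather than NYC data.

```python
def allocate(total_tests, incidence, risk, p1, p2):
    """Split total_tests across neighborhoods (hypothetical scoring rule).

    Each neighborhood's score is p1 * (its share of incidence)
    plus p2 * (its share of risk); tests are split in proportion to the scores.
    """
    inc_total = sum(incidence)
    risk_total = sum(risk)
    scores = [p1 * i / inc_total + p2 * r / risk_total
              for i, r in zip(incidence, risk)]
    score_total = sum(scores)
    return [total_tests * s / score_total for s in scores]


def expected_cases(allocation, positivity):
    """Expected detected cases, assuming a fixed positivity rate per neighborhood."""
    return sum(a * rate for a, rate in zip(allocation, positivity))


def grid_search(total_tests, incidence, risk, positivity, min_share=0.45, steps=10):
    """Scan p1 over a uniform grid on [0, 1], with p2 = 1 - p1 (assumption).

    An allocation is rejected if any neighborhood receives fewer than
    min_share times the equal-split number of tests (assumed fairness floor).
    Returns (cases, p1, p2, allocation) for the best feasible point, or None.
    """
    floor = min_share * total_tests / len(incidence)
    best = None
    for k in range(steps + 1):
        p1 = k / steps
        p2 = 1.0 - p1
        alloc = allocate(total_tests, incidence, risk, p1, p2)
        if min(alloc) < floor:
            continue  # violates the fairness floor
        cases = expected_cases(alloc, positivity)
        if best is None or cases > best[0]:
            best = (cases, p1, p2, alloc)
    return best


# Toy inputs for three hypothetical neighborhoods (not real data).
result = grid_search(
    total_tests=1000,
    incidence=[10, 30, 60],
    risk=[1, 2, 3],
    positivity=[0.01, 0.02, 0.03],
)
```

In this toy setting, putting more weight on incidence raises the expected number of detected cases but shrinks the allocation to the lowest-incidence neighborhood, so the fairness floor is what stops the search from concentrating all tests in the highest-incidence areas.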
