Utilizing Model Residuals to Identify Rental Properties of Interest: The Price Anomaly Score (PAS) and Its Application to Real-time Data in Manhattan
Youssef Sultan, Jackson C. Rafter, Huyen T. Nguyen
TL;DR
This study addresses the challenge of identifying overpriced and underpriced Manhattan rental properties by leveraging real-time StreetEasy listings and residual-based analysis. It introduces the Price Anomaly Score (PAS), a metric that combines relative price deviation with standardized residuals to classify properties as Overpriced, Underpriced, or Fair-priced, and demonstrates how PAS can outperform the traditional Z-score alone in capturing pricing anomalies. The approach uses XGBoost with SHAP for feature importance, achieves an $R^2$ of about 0.79, and provides a practical framework for dynamic pricing insights in a volatile market. While promising, the work acknowledges limitations such as reliance on StreetEasy-only data, market dynamics, and subjective thresholding, and outlines future directions including external factors and real-time, adaptive modeling.
Abstract
Buyers and sellers often struggle to judge whether a property is priced fairly because they usually lack an objective view of the price distribution across the market they are interested in. Drawing on data collected on all available rental properties in Manhattan as of September 2023, this paper aims to strengthen our understanding of model residuals, specifically for machine learning models that generalize to the majority of a well-proportioned dataset. Deviations from predicted values are generally treated as mere inaccuracies; this paper proposes a different vantage point: when a model generalizes to at least 75\% of the dataset, the remaining deviations reveal significant insights. To harness these insights, we introduce the Price Anomaly Score (PAS), a metric capable of capturing boundaries between irregularly predicted prices. By combining relative pricing discrepancies with statistical significance, PAS offers a multifaceted view of rental valuations. The metric allows experts to identify overpriced or underpriced properties within a dataset by aggregating PAS values and then tuning upper and lower boundaries to thresholds of their choice.
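To make the two ingredients named above concrete, the following minimal Python sketch computes a relative price deviation and a standardized residual from listed and predicted rents, then labels listings using adjustable thresholds. The abstract does not state the PAS formula, so the function `placeholder_score` (a simple product of the two ingredients), the threshold values, and the sample rents are all illustrative assumptions and should not be read as the authors' definition.

```python
import numpy as np

def residual_ingredients(actual, predicted):
    """Return (relative deviation, standardized residual) per listing."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = actual - predicted
    # Relative deviation: residual as a fraction of the predicted rent.
    relative_dev = residuals / predicted
    # Standardized residual: z-score of each residual within the dataset.
    z = (residuals - residuals.mean()) / residuals.std()
    return relative_dev, z

def placeholder_score(relative_dev, z):
    # ASSUMPTION: a stand-in combination, not the paper's PAS formula.
    # The sign comes from the relative deviation; |z| scales its magnitude.
    return relative_dev * np.abs(z)

def classify(score, lower, upper):
    """Label each listing given lower and upper score boundaries."""
    return np.where(
        score > upper, "Overpriced",
        np.where(score < lower, "Underpriced", "Fair-priced"),
    )

# Hypothetical monthly rents (USD): listed price vs. model prediction.
actual = [3000, 4000, 5000, 9000]
predicted = [3100, 3900, 5100, 6000]

relative_dev, z = residual_ingredients(actual, predicted)
score = placeholder_score(relative_dev, z)
labels = classify(score, lower=-0.1, upper=0.1)  # thresholds are arbitrary
print(labels.tolist())
```

In this sketch only the last listing, whose listed rent is far above its prediction, crosses the upper boundary; widening or narrowing `lower` and `upper` changes how many listings are flagged, which mirrors the threshold tuning described in the abstract.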
