RE-RecSys: An End-to-End system for recommending properties in Real-Estate domain
Venkatesh C, Harshit Oberoi, Anil Goyal, Nikhil Sikka
TL;DR
RE-RecSys addresses the challenge of recommending real-estate properties to a diverse user base that purchases infrequently. It deploys an end-to-end pipeline that classifies users into four history-based categories and combines a rule-based cold-start component, a content-based filter for short-term users, a collaborative filtering module for long-term users, and a hybrid approach for short-long-term users, all within production-friendly latency. The system is trained and evaluated on a large-scale real-estate dataset from India, using metrics such as MAP@K and NDCG, and demonstrates deployability with an average latency under 40 ms at 1000 rpm. The work provides practical guidance on data windows, weighting schemes for impressions, and retraining cadence, enabling real-world deployment and scalable personalization on high-traffic real-estate platforms.
Abstract
We propose an end-to-end real-estate recommendation system, RE-RecSys, which has been productionized in a real-world industry setting. We assign every user to one of four categories based on the available historical data: i) cold-start users; ii) short-term users; iii) long-term users; and iv) short-long-term users. For cold-start users, we propose a novel rule-based engine built on locality popularity and user preferences. For short-term users, we propose a content-filtering model that recommends properties based on the users' recent interactions. For long-term and short-long-term users, we propose a novel combination of content-based and collaborative filtering that can be easily productionized in a real-world scenario. Moreover, based on the conversion rate, we have designed a novel weighting scheme for the different impressions made by users on the platform, which is used to train the content and collaborative models. Finally, we show the efficiency of the proposed pipeline, RE-RecSys, on a real-world property and clickstream dataset collected from a leading real-estate platform in India. We show that the proposed pipeline is deployable in a real-world scenario with an average latency of <40 ms while serving 1000 rpm.
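The four-way user categorization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the categories are delimited, so the 7-day window (SHORT_TERM_WINDOW), the function name categorize_user, and the RECOMMENDER_FOR mapping are all assumptions introduced purely for illustration.

```python
from datetime import datetime, timedelta
from typing import List

# Hypothetical cutoff separating "recent" from "older" interactions.
# The abstract does not state the actual window used by RE-RecSys.
SHORT_TERM_WINDOW = timedelta(days=7)


def categorize_user(interaction_times: List[datetime], now: datetime) -> str:
    """Assign a user to one of the four categories named in the abstract.

    The decision rule (any recent vs. any older interaction) is an
    assumption; only the category names come from the source.
    """
    if not interaction_times:
        return "cold-start"
    cutoff = now - SHORT_TERM_WINDOW
    has_recent = any(t >= cutoff for t in interaction_times)
    has_older = any(t < cutoff for t in interaction_times)
    if has_recent and has_older:
        return "short-long-term"
    if has_recent:
        return "short-term"
    return "long-term"


# Component used per category, paraphrasing the abstract.
RECOMMENDER_FOR = {
    "cold-start": "rule-based engine (locality popularity + user preferences)",
    "short-term": "content-filtering model on recent interactions",
    "long-term": "combined content + collaborative filtering",
    "short-long-term": "combined content + collaborative filtering",
}

if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    history = [now - timedelta(days=1), now - timedelta(days=30)]
    category = categorize_user(history, now)
    print(category, "->", RECOMMENDER_FOR[category])
```

Routing on the category first keeps each request on a single, predictable code path, which is one plausible way to stay inside a tight latency budget; the actual serving design is not described in the abstract.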
