FairHome: A Fair Housing and Fair Lending Dataset

Anusha Bagalkotkar, Aveek Karmakar, Gabriel Arnson, Ondrej Linda

TL;DR

FairHome introduces a dataset of about 75,000 examples spanning 9 protected categories for detecting fair housing and fair lending violations in conversational contexts. The authors train a BERT-based binary classifier on this data and benchmark it against state-of-the-art LLMs in zero- and few-shot settings; the classifier shows strong detection performance (F1 up to 0.91 and recall 0.84) and outperforms the large models in many scenarios. The dataset is complemented by annotation guidelines, data augmentation, and a guardrail-oriented evaluation that underscores its practical utility for keeping real-estate dialogues compliant. The work contributes a publicly accessible resource and a robust detector for safer, more equitable AI-assisted housing decisions, and the authors plan to expand the dataset's coverage and refine it further.

Abstract

We present a Fair Housing and Fair Lending dataset (FairHome), a dataset with around 75,000 examples across 9 protected categories. To the best of our knowledge, FairHome is the first publicly available dataset with binary labels for compliance risk in the housing domain. We demonstrate the usefulness and effectiveness of such a dataset by training a classifier and using it to detect potential violations when using a large language model (LLM) in the context of real-estate transactions. We benchmark the trained classifier against state-of-the-art LLMs including GPT-3.5, GPT-4, LLaMA-3, and Mistral Large in both zero-shot and few-shot contexts. Our classifier outperformed these models with an F1-score of 0.91, underscoring the effectiveness of our dataset.
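
As a reading aid for the reported F1-score, here is a minimal sketch of how precision, recall, and F1 are computed for a binary compliance classifier. The labels below are invented toy values (1 = non-compliant, 0 = compliant), not FairHome data, and the helper name `precision_recall_f1` is hypothetical; it is not the authors' evaluation code.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # violations correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # compliant text wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # violations missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, so a high F1 requires both to be high.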

Paper Structure

This paper contains 38 sections, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Data Collection
  • Figure 2: Protected Category Distribution in Zillow Fair Housing and Fair Lending Dataset
  • Figure 3: Compliant vs. Non-compliant Distribution for Protected Categories
  • Figure 4: Accuracy of models on sampled queries with protected categories with LLMs in zero-shot setting
  • Figure 5: Accuracy of models on sampled queries with protected categories with LLMs in few-shot setting