Autonomous AI Agents for Real-Time Affordable Housing Site Selection: Multi-Objective Reinforcement Learning Under Regulatory Constraints

Olaf Yunus Laitinen Imanov, Duygu Erisken, Derya Umut Kulali, Taner Yilmaz, Rana Irem Turhan

TL;DR

AURA (Autonomous Urban Resource Allocator) is a hierarchical multi-agent reinforcement learning system for real-time affordable housing site selection under hard regulatory constraints. It models the task as a constrained multi-objective Markov decision process that optimizes accessibility, environmental impact, construction cost, and social equity while enforcing feasibility.

Abstract

Affordable housing shortages affect billions, while land scarcity and regulations make site selection slow. We present AURA (Autonomous Urban Resource Allocator), a hierarchical multi-agent reinforcement learning system for real-time affordable housing site selection under hard regulatory constraints (QCT, DDA, LIHTC). We model the task as a constrained multi-objective Markov decision process optimizing accessibility, environmental impact, construction cost, and social equity while enforcing feasibility. AURA uses a regulatory-aware state representation that encodes 127 federal and local constraints, Pareto-constrained policy gradients with feasibility guarantees, and reward decomposition that separates immediate costs from long-term social outcomes. On datasets from 8 U.S. metros (47,392 candidate parcels), AURA attains 94.3% regulatory compliance and improves Pareto hypervolume by 37.2% over strong baselines. In a New York City 2026 case study, it reduces selection time from 18 months to 72 hours and identifies 23% more viable sites; chosen sites have 31% better transit access and 19% lower environmental impact than expert picks.
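
The abstract reports results in terms of Pareto hypervolume. As a rough illustration of that metric, and not the authors' implementation, the sketch below filters a set of candidate sites down to its Pareto front and computes the two-objective hypervolume against a reference point. The reduction to two objectives (accessibility to maximize, cost to minimize), the function names, and the sample values are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical two-objective point: (accessibility, cost).
# Accessibility is maximized, cost is minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    unique = list(dict.fromkeys(points))  # drop exact duplicates, keep order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front, bounded by ref = (min accessibility, max cost)."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] < ref[1]]
    # On a front, sorting by ascending cost also sorts by ascending accessibility.
    front.sort(key=lambda p: p[1])
    area = 0.0
    prev_acc = ref[0]
    for acc, cost in front:
        # Each point adds a vertical strip between the previous accessibility and its own.
        area += (acc - prev_acc) * (ref[1] - cost)
        prev_acc = acc
    return area


# Illustrative values only, not data from the paper.
sites = [(9.0, 8.0), (6.0, 5.0), (5.0, 7.0), (3.0, 2.0)]
front = pareto_front(sites)
hv = hypervolume_2d(sites, ref=(0.0, 10.0))
```

A larger hypervolume means the front covers more of the objective space relative to the reference point, so a relative improvement between two methods can be read as `(hv_new - hv_old) / hv_old`. With four objectives, as in the abstract, the same idea applies but the computation needs a general-dimension algorithm rather than this strip sum.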

Paper Structure

This paper contains 35 sections, 16 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: Affordable housing deficit across eight major U.S. metropolitan areas (2026 data). New York City exhibits the most severe shortage at 342,000 units, followed by Los Angeles at 287,000 units. Total deficit across these cities exceeds 1.2 million units.
  • Figure 2: AURA hierarchical multi-agent architecture. The Coordination Agent (red) orchestrates specialized agents for geospatial analysis, regulatory compliance, and multi-objective optimization (blue), which inform the Execution Agent (green). Dashed arrows indicate data flow from urban data sources (gray).
  • Figure 3: Hypervolume comparison across eight cities. AURA consistently outperforms baselines, with largest gains in NYC (34.5% over HES) and Philadelphia (37.4% over HES). Error bars indicate standard deviation over 10 runs.
  • Figure 4: Pareto front comparison for NYC: Accessibility vs. Cost. AURA discovers solutions dominating HES across the entire front, achieving higher accessibility at every cost level. Shaded region indicates AURA's dominance area.
  • Figure 5: Training convergence measured by average hypervolume over epochs. AURA converges faster (200 epochs to 95% final HV) compared to Single-Policy MORL (280 epochs) and NSGA-II (350 epochs). Shaded regions indicate standard deviation over 5 training runs.
  • ...and 1 more figure