A multi-objective combinatorial optimisation framework for large scale hierarchical population synthesis

Imran Mahmood, Nicholas Bishop, Anisoara Calinescu, Michael Wooldridge, Ioannis Zachos

TL;DR

The paper addresses scalable synthetic population generation for agent-based simulations by formulating it as a multi-objective combinatorial optimisation problem that aligns demographic distributions with contingency-table data. It adopts NSGA-II to search for Pareto-optimal populations, minimising objective functions that measure differences between real and synthetic category frequencies across attributes, expressed as $O_i(X)=\sum_{j=1}^{m_i} | f_{A_i,j}-f'_{A_i,j}(X)|$. The approach is validated via a UK census-based MSOA case study using persons and households, with rule-based validation and parallel evolutionary computing extending DEAP to produce interpretable Pareto fronts and RMSE-based assessments. The results indicate the method scales to large populations and offers a flexible framework for policymakers to tailor synthetic populations toward specific objectives for scenario testing while preserving privacy. Overall, the work contributes a novel multi-objective optimisation framework for large-scale hierarchical population synthesis and demonstrates practical applicability through UK data.
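As a rough illustration of the per-attribute objective $O_i(X)$ quoted above, the following minimal sketch computes one absolute-difference score per attribute and compares two candidate populations by Pareto dominance. It is not the authors' implementation: the function names (`objective`, `evaluate`, `dominates`), the toy attributes, and the use of raw counts as frequencies are all assumptions made for this example, and NSGA-II itself (selection, crossover, mutation) is not shown.

```python
from collections import Counter


def objective(target_counts, synthetic_values):
    """O_i(X): sum over categories of |target count - synthetic count|.

    Assumption: frequencies are raw counts, not proportions.
    """
    synth = Counter(synthetic_values)
    categories = set(target_counts) | set(synth)
    return sum(abs(target_counts.get(c, 0) - synth.get(c, 0)) for c in categories)


def evaluate(population, targets):
    """Return a tuple with one objective value per attribute in `targets`."""
    return tuple(
        objective(targets[attr], [person[attr] for person in population])
        for attr in targets
    )


def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b` (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Hypothetical target marginals for two attributes (toy data, not census data).
targets = {
    "sex": {"F": 2, "M": 2},
    "age_band": {"0-15": 1, "16-64": 2, "65+": 1},
}

# Two hypothetical candidate populations of four persons each.
pop_a = [
    {"sex": "F", "age_band": "0-15"},
    {"sex": "F", "age_band": "16-64"},
    {"sex": "M", "age_band": "16-64"},
    {"sex": "M", "age_band": "65+"},
]
pop_b = [
    {"sex": "F", "age_band": "16-64"},
    {"sex": "F", "age_band": "16-64"},
    {"sex": "F", "age_band": "16-64"},
    {"sex": "M", "age_band": "65+"},
]

scores_a = evaluate(pop_a, targets)
scores_b = evaluate(pop_b, targets)
print(scores_a, scores_b, dominates(scores_a, scores_b))
```

Here `pop_a` matches both marginals exactly, so it dominates `pop_b`. In a multi-objective search, candidates that no other candidate dominates form the Pareto front.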

Abstract

In agent-based simulations, synthetic populations of agents are commonly used to represent the structure, behaviour, and interactions of individuals. However, generating a synthetic population that accurately reflects real population statistics is a challenging task, particularly when performed at scale. In this paper, we propose a multi-objective combinatorial optimisation technique for large-scale population synthesis. We demonstrate the effectiveness of our approach by generating a synthetic population for selected regions and validating it on contingency tables from real population data. Our approach supports complex hierarchical structures between individuals and households, is scalable to large populations, and achieves minimal contingency table reconstruction error. Hence, it provides a useful tool for policymakers and researchers for simulating the dynamics of complex populations.

Paper Structure

This paper contains 8 sections, 4 equations, and 8 figures.

Figures (8)

  • Figure 1: Genetic Algorithm Flow
  • Figure 2: Synthetic Population Generation using NSGA-II
  • Figure 3: Individual generation and Fitness Calculation
  • Figure 4: (a) Input Tables for fitness evaluation (b) Household Composition Types
  • Figure 5: (a) Generated Persons (b) Generated Households
  • ...and 3 more figures