The illusion of households as entities in social networks

Izabel Aguiar, Philip S. Chodrow, Johan Ugander

Abstract

Data recording connections between people in communities and villages are collected and analyzed in various ways, most often as either networks of individuals or as networks of households. These two networks can differ in substantial ways. The methodological choice of which network to study, therefore, is an important aspect in both study design and data analysis. In this work, we consider various key differences between household and individual social network structure, and ways in which the networks cannot be used interchangeably. In addition to formalizing the choices for representing each network, we explore the consequences of how the results of social network analysis change depending on the choice between studying the individual and household network -- from determining whether networks are assortative or disassortative to the ranking of influence-maximizing nodes. As our main contribution, we draw upon related work to propose a set of systematic recommendations for determining the relevant network representation to study. Our recommendations include assessing a series of entitativity criteria and relating these criteria to theories and observations about patterns and norms in social dynamics at the household level: notably, how information spreads within households and how power structures and gender roles affect this spread. We draw upon the definition of an illusion of entitativity to identify cases wherein grouping people into households does not satisfy these criteria or adequately represent given cultural or experimental contexts. Given the widespread use of social network data for studying communities, there is broad impact in understanding which network to study and the consequences of that decision. We hope that this work gives guidance to practitioners and researchers collecting and studying social network data.

Paper Structure

This paper contains 29 sections, 2 theorems, 9 equations, 9 figures.

Key Result

Proposition 1

Consider a random graph $G(V,E)$ generated by a $G(n,p)$ model. A corresponding unweighted household graph $G'(V',E')$ is constructed by forming $R$ disjoint node sets $H_1, H_2, \dots, H_R$ of sizes $\ell_1, \ell_2, \dots, \ell_R$ by choosing nodes uniformly at random, and contracting these sets into
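The construction named in the proposition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's code: each node set is assumed to be contracted into a single household node, two households are linked whenever any of their members are linked, and nodes left over after the listed sizes are filled are treated as single-person households. The function names `sample_gnp`, `random_households`, and `contract` are hypothetical.

```python
import random
from itertools import combinations


def sample_gnp(n, p, rng):
    """Sample an undirected G(n, p) graph as a set of edges (i, j) with i < j."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def random_households(n, sizes, rng):
    """Assign nodes to disjoint households of the given sizes, uniformly at random.

    Assumption: nodes left over after filling the listed sizes become
    single-person households.
    """
    if sum(sizes) > n:
        raise ValueError("household sizes exceed the number of nodes")
    nodes = list(range(n))
    rng.shuffle(nodes)
    household_of = {}
    start = 0
    for h, size in enumerate(sizes):
        for v in nodes[start:start + size]:
            household_of[v] = h
        start += size
    for offset, v in enumerate(nodes[start:]):
        household_of[v] = len(sizes) + offset
    return household_of


def contract(edges, household_of):
    """Unweighted contraction: link two households if any of their members are linked."""
    household_edges = set()
    for i, j in edges:
        a, b = household_of[i], household_of[j]
        if a != b:  # edges inside a household disappear after contraction
            household_edges.add((min(a, b), max(a, b)))
    return household_edges


rng = random.Random(0)
edges = sample_gnp(12, 0.2, rng)
household_of = random_households(12, [3, 3, 2, 2, 2], rng)
household_edges = contract(edges, household_of)
```

Comparing statistics of `edges` with those of `household_edges` over many random draws is one way to see how contraction alone changes network structure, even when the household assignment carries no social meaning.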

Figures (9)

  • Figure 1: This work considers the choice of which network is the most appropriate to study in a given context, a choice which we present as consequential for meaningful empirical network analysis. In this diagram we show how individual networks (left, blue) are often translated to household networks (right, red). In what we refer to throughout as the individual network, nodes are individuals and an edge between two individuals is collected through a survey. In what we refer to as the household network, nodes are households and an edge between two households is usually determined by aggregating the relationships between individuals in that household. We specify this decision to represent household edges in this way as the basic household contraction rule and propose alternate methods for defining edges between households in \ref{subsubsec:cont_rules}. In the diagram here, we also represent individuals within the same household as completely connected to one another, an aspect of some individual network datasets which we discuss in more detail in \ref{subsub:local}. We review how individual and household networks are collected and studied in practice in \ref{tab:review}.
  • Figure 2: In this work we provide a systematic recommendation for determining whether the household or individual network should be studied given a particular context and experimental goal or intervention. The decision tree here poses a contextual evaluation of a set of entitativity criteria (proximity, similarity, common fate, and internal diffusion, which we discuss in detail in \ref{sec:rec}) to determine an appropriate level of node aggregation, as well as to suggest how to weight edges. In \ref{fig:ex_decision} we apply these recommendations to three separate examples to determine an appropriate network to analyze.
  • Figure 3: The household and individual networks from banerjee2013 have substantively different interpretations when considering both the degree assortativity and clustering coefficients.
  • Figure 4: The inversity of the banerjee2013 networks at the individual and household level. In light grey, we plot the histogram of the inversity of the individual networks across villages when the intrahousehold edges are removed.
  • Figure 5: The intersection between the sets of the 10 most influential households, found by greedily maximizing the average proportion of nodes reached over 1000 independent cascades on the household and individual networks from banerjee2013. To compare sets, we map the set of the 10 influence maximizing individuals to their corresponding households.
  • ...and 4 more figures
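
The comparison described in the Figure 5 caption (greedy seed selection by average cascade reach, then mapping individual seeds to their households) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the independent cascade model with a single activation probability `p`, the toy graph, and the names `independent_cascade`, `expected_reach`, `greedy_seeds`, and `to_households` are all assumptions introduced here.

```python
import random


def independent_cascade(adj, seeds, p, rng):
    """Run one independent cascade and return the set of nodes reached.

    Each newly activated node gets one chance to activate each inactive
    neighbor, succeeding with probability p.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_reach(adj, seeds, p, n_sims, rng):
    """Average proportion of nodes reached over n_sims independent cascades."""
    total = sum(len(independent_cascade(adj, seeds, p, rng)) for _ in range(n_sims))
    return total / (n_sims * len(adj))


def greedy_seeds(adj, k, p, n_sims, rng):
    """Greedily add the node that most increases the estimated reach."""
    seeds = []
    for _ in range(k):
        candidates = [v for v in sorted(adj) if v not in seeds]
        best = max(
            candidates,
            key=lambda v: expected_reach(adj, seeds + [v], p, n_sims, rng),
        )
        seeds.append(best)
    return seeds


def to_households(seeds, household_of):
    """Map a set of individual seeds to the households they belong to."""
    return {household_of[v] for v in seeds}


# Toy individual network (hypothetical data): a triangle and a separate pair.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}
household_of = {0: "A", 1: "A", 2: "B", 3: "C", 4: "C"}

rng = random.Random(0)
individual_seeds = greedy_seeds(adj, k=2, p=0.3, n_sims=1000, rng=rng)
seed_households = to_households(individual_seeds, household_of)
```

Running the same greedy procedure on the household network and intersecting its result with `seed_households` gives the kind of overlap the caption describes.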

Theorems & Definitions (4)

  • Proposition 1
  • proof
  • Proposition 2
  • proof