The Categorical Instrumental Variable Model: Characterization, Partial Identification, and Statistical Inference

Yilin Song; F. Richard Guo; K. C. Gary Chan; Thomas S. Richardson

TL;DR

This work analyzes partial identification for categorical instrumental variable models where $Z$, $X$, and $Y$ are finite-valued. It derives a simple closed-form characterization of the joint counterfactual distribution $P'(Y(x_1),\dots,Y(x_K))$ via linear inequalities that link counterfactuals to the observed $P(X,Y\mid Z)$, and shows these are necessary, sufficient, and non-redundant across five IV models defined by exclusion and independence variants. The sufficiency proof uses Strassen's theorem to construct couplings, yielding a polyhedral description that supports sharp bounds on linear functionals such as the average treatment effect and enables a falsification test. For inference, the paper develops a conservative finite-sample CI framework based on a KL-divergence tail bound, implemented through convex programming, and demonstrates practical utility with the Minneapolis Domestic Violence Experiment data, where multi-arm IV analysis is essential.

Abstract

We study categorical instrumental variable (IV) models with instrument, treatment, and outcome taking finitely many values. We derive a simple closed-form characterization of the set of joint distributions of potential outcomes that are compatible with a given observed data distribution in terms of a set of inequalities. These inequalities unify several different IV models defined by versions of the independence and exclusion restriction assumptions and are shown to be non-redundant. Finally, given a set of linear functionals of the joint counterfactual distribution, such as pairwise average treatment effects, we construct confidence intervals with simultaneous finite-sample coverage, using a tail bound on the Kullback--Leibler divergence. We illustrate our method using data from the Minneapolis Domestic Violence Experiment.
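To make the idea of a KL-divergence confidence region concrete, here is a minimal sketch for a single multinomial sample. It uses the classical method-of-types bound, $P\{D(\hat p_n \,\|\, p) \ge \varepsilon\} \le \binom{n+k-1}{k-1} e^{-n\varepsilon}$, as a stand-in; this is an assumption made for illustration and is not necessarily the tail bound used in the paper, which also handles several instrument arms simultaneously. The function names `kl_divergence`, `kl_radius`, and `in_confidence_region` are hypothetical.

```python
import math

def kl_divergence(p_hat, p):
    """KL divergence D(p_hat || p) in nats, with the convention 0 * log 0 = 0."""
    total = 0.0
    for a, b in zip(p_hat, p):
        if a > 0:
            if b == 0:
                return math.inf  # p_hat puts mass where p has none
            total += a * math.log(a / b)
    return total

def kl_radius(n, k, alpha):
    """Radius eps with P(D(p_hat || p) >= eps) <= alpha.

    Assumption: uses the method-of-types bound, where the number of
    empirical distributions (types) on k categories is C(n + k - 1, k - 1).
    """
    log_num_types = math.log(math.comb(n + k - 1, k - 1))
    return (log_num_types + math.log(1.0 / alpha)) / n

def in_confidence_region(counts, p, alpha):
    """Check whether a candidate distribution p lies in the level-(1 - alpha) KL ball."""
    n = sum(counts)
    p_hat = [c / n for c in counts]
    return kl_divergence(p_hat, p) <= kl_radius(n, len(counts), alpha)
```

A candidate `p` is retained when `in_confidence_region(counts, p, alpha)` is true. Because the region is a sublevel set of a function that is convex in `p`, optimizing a linear functional over it is a convex program, which is consistent with the convex-programming implementation mentioned in the TL;DR.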

Paper Structure (26 sections, 17 theorems, 61 equations, 6 figures, 6 tables, 1 algorithm)

Introduction
Motivating example: Minneapolis domestic violence experiment
Contribution of the paper
Related prior work
Outline
Notation, Assumptions, and Models
Assumptions
Main Results
Proof of necessity
Strassen's theorem and proof of sufficiency
Eliminating redundant inequalities
Statistical inference on partial identification bounds
Motivating Example Revisited
Conclusion and discussion
Proof of necessity for Theorem \ref{theorem:main-result}
...and 11 more sections

Key Result

Lemma 2.1

We have ${\cal M}_1 \subset {\cal M}_2 \subset {\cal M}_3$, ${\cal M}_1 \subset {\cal M}_4$ and ${\cal M}_2 \subset {\cal M}_5$.
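The inclusions in Lemma 2.1 imply two further ones by transitivity, namely ${\cal M}_1 \subset {\cal M}_3$ and ${\cal M}_1 \subset {\cal M}_5$. The short sketch below derives them mechanically; the pair encoding `(i, j)` for ${\cal M}_i \subset {\cal M}_j$ and the names `DIRECT`, `transitive_closure`, and `IMPLIED` are illustrative choices, not notation from the paper.

```python
from itertools import product

# Inclusions stated in Lemma 2.1, encoded as (i, j) meaning M_i is contained in M_j.
DIRECT = {(1, 2), (2, 3), (1, 4), (2, 5)}

def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

# Inclusions that follow from the lemma but are not stated in it directly.
IMPLIED = transitive_closure(DIRECT) - DIRECT
```

Here `IMPLIED` contains `(1, 3)` and `(1, 5)`. No inclusion in either direction between ${\cal M}_4$ and ${\cal M}_5$, or between ${\cal M}_3$ and ${\cal M}_4$, follows from the lemma alone.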

Figures (6)

Figure 1.1: Directed acyclic graph (DAG) representing the assumptions of a valid instrumental variable, where the dashed edges are assumed to be absent.
Figure 2.1: Nested structure between models ${{\cal M}}_1$--${{\cal M}}_5$.
Figure 2.2: Graphical representations of independence and exclusion assumptions discussed in Section \ref{sec:assumptions}. ${\cal M}_1$ and ${\cal M}_4$ do not have confounding between $Z$ and $X$, and independence is encoded using the extension of d-separation to acyclic graphs with bi-directed ($\leftrightarrow$) edges [richardson-admg:2003]; ${\cal M}_2$, ${\cal M}_3$ and ${\cal M}_5$ allow confounding between $Z$ and $X$, and their independence assumptions follow from Pearl's d-separation for directed acyclic graphs. (Note that (e) encodes a slightly stronger version of (A\ref{assumption:indep}-4) with $X(z)$ replacing $X$.) In (a) and (b), when a variable is connected to its parents with double edges ($\Rightarrow$), the variable is a deterministic function of its parents. The individual exclusion assumption (A\ref{assumption:exclusion}-1) in ${\cal M}_1$ and ${\cal M}_2$ follows because $Y$ is determined by $\{Y(x,z)\}$ and $X$ (and not $Z$); the joint stochastic exclusion assumption (A\ref{assumption:exclusion}-2) in ${\cal M}_3$ cannot be (easily) represented graphically and is stated explicitly; individual exclusion in ${\cal M}_4$ is implied because the SWIG contains $Y(x)$ rather than $Y(x,z)$; the latent stochastic exclusion assumption in ${\cal M}_5$ is signified by the absence of an edge from $z$ to $Y(x,z)$; indeed, it holds that $z$ is d-separated from $Y(x,z)$ given $U$ [malinsky19b; RichardsonRobins-2023].
Figure 5.1: Illustration of pairs $(\bm{a},\bm{b}) \in {\cal A} \times {\cal B}$ when $K=M=2$ under a fixed instrument arm $z$. Each edge corresponds to a coherent pair.
Figure B.1: For each $B$ (vertices in box), $\mathcal{R}_{C}(B)$ consists of both blue edges (between ${\cal N}_{\mathcal{R}_C}'(B)$ and $B$) and red edges (between $\overline{{\cal N}_{\mathcal{R}_C}'(B)}$ and $\overline{B}$). By Lemma \ref{lemma:reverse-ineq}, there is a 1-1 correspondence between ${\cal V}\in{\cal A}$ and $B\subseteq{\cal B}$, and we have ${\cal V}=\overline{{\cal N}_{\mathcal{R}_C}'(B)}$. Note that the edges in (c) are contained in those in (a).
...and 1 more figure

Theorems & Definitions (44)

Lemma 2.1
proof
Theorem 3.1
Remark 3.1
Remark 3.2
Corollary 3.1
Theorem 3.2
Corollary 3.2
Remark 3.3
Example 1
...and 34 more
