Discussion of "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models"

Eli Ben-Michael, Avi Feller

Abstract

Choi and Yuan (2025) propose a novel approach to applying matrix completion to the problem of estimating causal effects in panel data. The key insight is that even in the presence of structured patterns of missing data -- i.e., selection into treatment -- matrix completion can be effective if the number of treated observations is small relative to the number of control observations. We applaud the authors for their insightful and interesting paper. We discuss this proposal from two complementary perspectives. First, we situate their proposal as an example of a "split-apply-combine" strategy that underlies many modern panel data estimators, including difference-in-differences and synthetic control approaches. Second, we discuss the statistical "last mile problem" -- the gap between theory and practice -- and offer suggestions on how to partially address it. We conclude by considering the challenges of estimating the impacts of public policies using panel data and apply the approach to a study of the effect of right-to-carry (RTC) laws on violent crime.

Paper Structure (12 sections, 3 figures)

Figures (3)

  • Figure 1: Split-apply-combine strategies for panel data estimators.
  • Figure 2: Treatment timing for the RTC data. Black indicates that the state had not yet adopted an RTC law at that time; white indicates that it had, so $Y_{it}(\infty)$ is missing.
  • Figure 3: Estimates of the event-time effects $\tau_k^\text{event}$ for $k=-20,\ldots,10$ using (i) nuclear-norm regularized matrix completion on the entire matrix; (ii) the CY estimator; (iii) partially pooled synthetic controls (Ben-Michael et al., 2022); (iv) the Gsynth estimator (Xu, 2017); and (v) a DiD estimator. Estimators (i)-(iv) are estimated without (left panel) and with (right panel) residualizing out unit and time fixed effects.