Bayesian Causal Forests for Longitudinal Data: Assessing the Impact of Part-Time Work on Growth in High School Mathematics Achievement
Nathan McJames, Ann O'Shea, Andrew Parnell
TL;DR
This paper develops a longitudinal extension of Bayesian Causal Forests (LBCF) that jointly estimates individual growth trajectories in mathematics achievement and the heterogeneous causal impact of part-time work, using two waves of data from the High School Longitudinal Study (HSLS). The model decomposes growth into a baseline trajectory and a period-specific treatment effect, and incorporates time-varying covariates, a propensity-score term, and missing-data handling within a Bayesian nonparametric framework built on BART ensembles. Simulation studies demonstrate strong predictive performance and reliable uncertainty quantification for both growth and heterogeneous effects, with LBCF outperforming standard BART, BCF, and GRF in key settings. The HSLS application reveals a negative average effect of intensive part-time work on growth ($ATE \approx -0.08$), with substantial heterogeneity and a potential positive effect for students with low school belonging. These results suggest nuanced policy implications: intensive work generally dampens growth, but targeted supports or alternative activities might mitigate the harm and could even benefit certain subgroups. The method offers a flexible tool for analyzing growth and effect heterogeneity in longitudinal causal settings, in education and beyond.
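As a rough sketch of the decomposition described above, a two-wave model of this kind could be written as follows. The notation is an assumption made for illustration and is not taken from the text: $y_{i,1}$ and $y_{i,2}$ denote student $i$'s achievement at waves 1 and 2, $x_i$ the covariates, $Z_i$ an indicator for part-time work, $\hat{\pi}_i$ an estimated propensity score, and $\mu$, $G$, and $\tau$ functions that would each be given a BART prior.

$$
\begin{aligned}
y_{i,1} &= \mu(x_i) + \epsilon_{i,1}, \\
y_{i,2} &= \mu(x_i) + G(x_i, \hat{\pi}_i) + \tau(x_i)\, Z_i + \epsilon_{i,2}.
\end{aligned}
$$

Under this sketch, expected growth between waves is $E[y_{i,2} - y_{i,1} \mid x_i, Z_i] = G(x_i, \hat{\pi}_i) + \tau(x_i)\, Z_i$, so $G$ plays the role of the baseline growth trajectory and $\tau(x_i)$ the heterogeneous effect of part-time work on growth. The average treatment effect would then be the sample mean $\frac{1}{n}\sum_{i=1}^{n} \tau(x_i)$.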
Abstract
Modelling growth in student achievement is a significant challenge in the field of education. Understanding how interventions or experiences such as part-time work can influence this growth is also important. Traditional methods like difference-in-differences are effective for estimating causal effects from longitudinal data. Meanwhile, Bayesian non-parametric methods have recently become popular for estimating causal effects from observational studies conducted at a single time point. However, there remains a scarcity of methods capable of combining the strengths of these two approaches to flexibly estimate heterogeneous causal effects from longitudinal data. Motivated by two waves of data from the High School Longitudinal Study, the NCES' most recent longitudinal study, which tracks a representative sample of over 20,000 students in the US, our study introduces a longitudinal extension of Bayesian Causal Forests. This model allows for the flexible identification of both individual growth in mathematical ability and the effects of participation in part-time work. Simulation studies demonstrate the predictive performance and reliable uncertainty quantification of the proposed model. Results reveal a negative impact of part-time work for most students, but hint at potential benefits for students with an initially low sense of school belonging. Clear signs of a widening achievement gap between students with high and low academic achievement are also identified. Potential policy implications are discussed, along with promising areas for future research.
