Random Forests for time-fixed and time-dependent predictors: The DynForest R package

Anthony Devaux, Cécile Proust-Lima, Robin Genuer

TL;DR

DynForest introduces a random forest framework that handles time-fixed and time-dependent predictors for continuous, categorical, or survival outcomes. By summarizing longitudinal predictors via flexible linear mixed models, it creates time-fixed features that enable standard tree-based splits while accommodating irregular measurement times and measurement error. The approach is implemented in the DynForest R package with DynForest and predict functions, plus utilities for out-of-bag (OOB) error, variable importance, and minimal depth analysis across survival, classification, and regression settings. The method offers practical tools for personalized dynamic predictions and interpretable variable selection in settings with rich longitudinal data and competing risks.

Abstract

The R package DynForest implements random forests for predicting a continuous, a categorical, or a (multiple causes) time-to-event outcome based on time-fixed and time-dependent predictors. The main originality of DynForest is that it handles time-dependent predictors that can be endogenous (i.e., impacted by the outcome process), measured with error, and measured at subject-specific times. At each recursive step of the tree-building process, the time-dependent predictors are internally summarized into individual features on which the split can be done. This is achieved using flexible linear mixed models (fitted with the R package lcmm) whose specification is defined by the user. DynForest returns the mean for a continuous outcome, the category with a majority vote for a categorical outcome, or the cumulative incidence function over time for a survival outcome. DynForest also computes variable importance and minimal depth to identify the most predictive variables or groups of variables. This paper aims to guide the user with step-by-step examples for fitting random forests using DynForest.
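The summarization idea in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and does not use the DynForest R interface. It rests on a strong simplifying assumption: each subject's trajectory is reduced to an intercept and a slope by ordinary least squares on that subject's own visits, whereas DynForest relies on linear mixed models fitted with lcmm. The function names and toy data are hypothetical, and the survival case (cumulative incidence function) is not covered.

```python
from collections import Counter
from statistics import mean


def summarize_trajectory(times, values):
    """Reduce one subject's repeated measurements to (intercept, slope).

    Assumption: per-subject ordinary least squares, used here as a
    simplified stand-in for subject-specific mixed-model summaries.
    """
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:  # single visit or identical times: slope not identifiable
        return y_bar, 0.0
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def aggregate_continuous(tree_predictions):
    """Forest prediction for a continuous outcome: mean over trees."""
    return mean(tree_predictions)


def aggregate_categorical(tree_predictions):
    """Forest prediction for a categorical outcome: majority vote over trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical toy data: each subject is measured at their own visit times.
subjects = {
    "A": ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),
    "B": ([0.5, 2.5], [4.0, 2.0]),
}

# Time-fixed features (intercept, slope) on which a tree could split.
features = {sid: summarize_trajectory(t, y) for sid, (t, y) in subjects.items()}
```

Because every subject ends up with the same fixed-length feature vector regardless of how many visits they had or when those visits occurred, a standard tree split can be applied to these features. A mixed model would additionally borrow strength across subjects and account for measurement error, which this sketch does not attempt.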

Paper Structure (21 sections, 5 equations, 1 figure)

Figures (1)

  • Figure 1: Overall scheme of the tree building in DynForest with (A) the tree structure, (B) the node-specific treatment of time-dependent predictors to obtain time-fixed features, (C) the dichotomization of the time-fixed features, (D) the splitting rule.