Functional Moments Regression

Mingyuan Li, Martin A. Lindquist, Edward Gunning, Ciprian Crainiceanu

Abstract

The Gaussian Process (GP) assumption is often used in functional data analysis. We propose a method to assess departures from the GP assumption, both in terms of the shape of the distribution and its potential dependence on covariates, using a sequence of functional moment regressions. Our methods are inspired by and applied to objectively measured minute-level physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014 study. In this setting, we find that the GP assumption is not satisfied, quantify the associations between functional moments and covariates, and show that standard data transformations, such as the log transformation, do not resolve the discrepancy between assumptions and reality. We further show that when the effect sizes are moderate, inference on the functional fixed effects is largely unaffected by departures from the GP assumption. However, when effect sizes are small, both inference and prediction of subject-level data can be strongly affected. Extensive simulations support these findings. This pragmatic paper presents new methods for real data analysis, with implications for statistical methodology and for understanding human activity and health.
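As a rough illustration of the quantities involved, the sketch below computes pointwise sample moments (mean, standard deviation, skewness, excess kurtosis) across subjects at each time point. It is not the authors' functional moment regression: there are no covariates and no smoothing, the data are simulated, and the function name `pointwise_moments` and all array shapes are assumptions made for this example. Under a Gaussian model, skewness and excess kurtosis should be close to zero at every time point.

```python
import numpy as np


def pointwise_moments(Y):
    """Return pointwise mean, SD, skewness, and excess kurtosis.

    Y is a (subjects x time points) array; moments are taken across
    subjects separately at each time point (hypothetical helper).
    """
    Y = np.asarray(Y, dtype=float)
    mean = Y.mean(axis=0)
    centered = Y - mean
    var = (centered ** 2).mean(axis=0)       # second central moment
    sd = np.sqrt(var)
    skew = (centered ** 3).mean(axis=0) / sd ** 3
    exkurt = (centered ** 4).mean(axis=0) / var ** 2 - 3.0
    return mean, sd, skew, exkurt


# Simulated data only: 2000 "subjects" observed at 24 time points.
rng = np.random.default_rng(0)
n_subjects, n_times = 2000, 24
gaussian = rng.normal(size=(n_subjects, n_times))
skewed = rng.exponential(size=(n_subjects, n_times))

g_mean, g_sd, g_skew, g_kurt = pointwise_moments(gaussian)
s_mean, s_sd, s_skew, s_kurt = pointwise_moments(skewed)

# Gaussian curves: higher moments near zero; skewed curves: clearly not.
print("Gaussian: mean skewness", g_skew.mean(), "mean excess kurtosis", g_kurt.mean())
print("Skewed:   mean skewness", s_skew.mean(), "mean excess kurtosis", s_kurt.mean())
```

Comparing the two printed lines shows how nonzero pointwise skewness or excess kurtosis signals a departure from the Gaussian assumption at the marginal level.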

Paper Structure

This paper contains 21 sections, 23 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: Smooth estimators of the mean, standard deviation, skewness, and excess kurtosis of MIMS as functions of time (x-axis). Smoothing is done using penalized cyclic cubic splines.
  • Figure 2: Smooth estimators of the mean, standard deviation, skewness, and excess kurtosis of transformed MIMS as functions of time (x-axis). Smoothing is done using penalized cyclic cubic splines.
  • Figure 3: Estimated fixed effects (black solid lines) of subject characteristics on log-transformed MIMS data in the NHANES study, with pointwise (blue) and Correlation and Multiplicity Adjusted (CMA, red) 95% confidence bands. Continuous predictors such as age, BMI, and poverty income ratio are centered. From left to right, top to bottom: intercept term, age (in years), gender (female), education: high school equivalent, and more than high school.
  • Figure 4: Estimated noise variance after accounting for time-varying fixed effects of the mean (top panel) and full residual variance (bottom panel) of the log-transformed MIMS outcome. Results are shown for male and female individuals aged 30 and 70 who are non-Hispanic White, have education above high school level, have no history of coronary heart disease, and have a BMI of 21 and a poverty income ratio (PIR) of 2.5. The noise variance estimator is smoothed using penalized cyclic cubic splines. Pointwise confidence intervals (blue) and Correlation and Multiplicity Adjusted (CMA, red) 95% confidence bands are included.
  • Figure 5: Estimated ratio of residual variance between females aged 30 and 70 (top panel) and between females and males aged 30 (bottom panel). Both panels correspond to individuals who are non-Hispanic White, have education above high school level, have no history of coronary heart disease, and have a BMI of 21 and a poverty income ratio (PIR) of 2.5. Pointwise confidence intervals (blue) and asymmetric Correlation and Multiplicity Adjusted (CMA, red) 95% confidence bands are included.
  • ...and 4 more figures