A Hypothesis-First Framework for Mechanistic Modeling in Neuroimaging

Dominic Boutet, Sylvain Baillet

TL;DR

The paper tackles the challenge of extracting mechanistic insight from neuroimaging data by introducing a hypothesis-first framework that tests mechanistic hypotheses before parameter estimation. It formalizes a two-part innovation: (i) evaluating model behavior under feature generalization constraints by computing $E[\tilde{Y}_{\alpha,\theta}|Z]$ and (ii) constructing mirror statistical models $\tilde{R}_{\alpha}$ to compare with empirical relationships, enabling direct accept/reject decisions. Using synthetic data from Wilson-Cowan neural mass models, the authors demonstrate that under- and over-parameterized, as well as structurally invalid hypotheses, are rejected, while appropriately specified models are retained. The framework thereby provides a practical pre-inference filter that improves interpretability and generalization of downstream inferences, while lowering the barrier for researchers without advanced dynamical-systems training. This approach complements traditional parameter estimation and holds promise for integrating mechanistic modeling more broadly into neuroimaging analyses.

Abstract

Turning rich neuroimaging data into mechanistic insight remains challenging. Statistical models capture associations but remain largely agnostic to underlying mechanisms. Biophysical models embody candidate mechanisms but remain difficult to deploy without specialized expertise. Here, we present a hypothesis-first framework that recasts model specifications as testable mechanistic hypotheses and streamlines the procedure for rejecting inappropriate hypotheses before moving to typical analyses. The key innovation is an expectation of model behavior under feature generalization constraints: we compute the model's expected $Y$ output across the parameter space based on the likelihood for a broader/distinct feature $Z$. Mirror statistical models are derived from these expected outputs and compared to the empirical ones with standard statistics. In synthetic experiments, our framework rejected mis-specified hypotheses and penalized unnecessary degrees of freedom while retaining valid hypotheses. These results demonstrate a practical hypothesis-driven approach for using mechanistic models in neuroimaging without requiring advanced training, complementing traditional analyses.
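The expectation step described in the abstract can be illustrated with a minimal sketch: parameter configurations are weighted by how well they reproduce the feature $Z$, and the weighted average of the model's $Y$ output is taken. Everything specific here is an assumption made for illustration and is not taken from the paper: `toy_model` is a hypothetical two-parameter stand-in for a mechanistic simulator (not a Wilson-Cowan model), the likelihood is Gaussian with an arbitrary `sigma`, and the parameter space is sampled uniformly.

```python
import numpy as np

def toy_model(theta):
    """Hypothetical simulator: maps parameters to a target feature Y
    and a broader generalization feature Z (illustrative only)."""
    a, b = theta
    y = a * b                      # scalar target feature Y
    z = np.array([a + b, a - b])   # two-dimensional feature Z
    return y, z

def expected_output(thetas, z_obs, sigma=0.1):
    """Likelihood-weighted expectation of Y over sampled parameters."""
    ys, log_liks = [], []
    for theta in thetas:
        y, z = toy_model(theta)
        ys.append(y)
        # Assumed Gaussian log-likelihood of the observed Z (up to a constant)
        log_liks.append(-0.5 * np.sum((z - z_obs) ** 2) / sigma**2)
    ys = np.asarray(ys)
    log_liks = np.asarray(log_liks)
    # Subtract the maximum before exponentiating for numerical stability
    weights = np.exp(log_liks - log_liks.max())
    weights /= weights.sum()
    return float(np.sum(weights * ys)), weights

rng = np.random.default_rng(0)
thetas = rng.uniform(0.0, 2.0, size=(10000, 2))  # uniform parameter sample
true_theta = np.array([1.2, 0.4])                # ground truth for the demo
y_true, z_obs = toy_model(true_theta)
y_expected, weights = expected_output(thetas, z_obs)
print(f"true Y = {y_true:.3f}, expected Y given Z = {y_expected:.3f}")
```

In this toy case $Z$ pins down the parameters uniquely, so the expected $Y$ lands close to the true value. When several parameter regions reproduce $Z$ equally well, the same average mixes their $Y$ outputs, which is one way an over-parameterized hypothesis could drift away from the empirical relationship.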

Paper Structure

This paper contains 33 sections, 27 equations, 7 figures, 1 table.

Figures (7)

  • Figure 3: Overview of the proposed hypothesis-first framework for mechanistic model evaluation. Empirical data are first transformed into features of interest (blue), defining an empirical statistical relationship between an independent feature $S$ and a target feature $Y$. Candidate mechanistic hypotheses are then formalized as mathematical models that generate simulated data under different parameterizations (red). For each hypothesis, an expectation-based procedure computes the expected model output in $Y$ by weighting parameter configurations according to their ability to reproduce a broader or distinct feature $Z$, yielding a mirror dataset (purple). This mirror dataset is used to derive a mirror statistical model, which is directly compared to the empirical statistical model using predefined statistical criteria (green). This comparison enables systematic rejection of ill-posed, under-parameterized, over-parameterized, or structurally invalid mechanistic hypotheses prior to parameter estimation or inference.
  • Figure 4: Synthetic dataset summary for Experiment 1. Each column illustrates a key component of the simulation. The first column shows the simulated model output $\gamma$ and the corresponding generalization feature $Z$ across 20 data points, colored according to their respective $S$ value. The middle column shows the template (green) versus modified (red) values of the inversion parameter $b_e$ (top) and bias parameter $c_{ie}$ (bottom) for each data point. The last column displays the relationship between the structural feature $S$ and the target feature $Y$, along with the fitted statistical model $R$, which corresponds to a univariate ordinary least squares linear regression. These panels illustrate all relevant components of the synthetic data with a manually inverted empirical relationship between $S$ and $Y$, providing a ground truth against which mechanistic hypotheses can be evaluated.
  • Figure 5: Empirical and mirror statistical models for Experiment 1 hypotheses. Each panel compares the empirical statistical model $R$ (dashed red line) to the mirror statistical model $\tilde{R}_\alpha$ (solid black line) generated by the framework for three mechanistic hypotheses ($\alpha_1$ to $\alpha_3$). Their associated datasets are shown as red crosses and colored dots, respectively. For mirror data points, dot color encodes the likelihood value from the expected model output $\tilde{Z}_{\alpha}$ ($\log_{10}$ scale), with yellow indicating high likelihood and black indicating low likelihood (see the supplementary figure for examples). Binary outcomes indicate whether model coefficients (intercept, slope) and residual distributions (CDF, mean, variance) differed significantly between empirical and mirror models (the statistics are detailed in the source file). Hypotheses lacking key parameters ($\alpha_2$) or containing unnecessary degrees of freedom ($\alpha_3$) are rejected according to all five tests, while the true hypothesis ($\alpha_1$) is retained.
  • Figure 6: Expected parameters for Experiment 1. True parameter values (red crosses) and expected parameter estimates (colored dots; same color scheme as in Figure 3) are shown for all three hypotheses after passing through the framework. Note: these parameter values do not represent the parameter configuration associated with the $\tilde{Y}_{\alpha}$ and $\tilde{Z}_{\alpha}$ outputs shown in Figure 3 and the associated supplementary figure; instead, they locate the center of mass, within the parameter space, of the likelihood distribution used to compute those outputs. Overlapping dots indicate that the correct regions of the parameter space contributed most to the computation of the mirror dataset, as is the case for the true hypothesis $\alpha_1$ (left column). Mismatching dots reflect either the absence of a key free parameter, as is the case for $\alpha_2$ (middle column), or the presence of interference due to multiple solutions for reproducing $Z$, as is the case for $\alpha_3$ (right column).
  • Figure 7: Synthetic dataset summary for Experiment 2. Each column illustrates a key component of the simulation. The first column shows the simulated model output $\gamma$ and the corresponding generalization feature $Z$ (only for node $n_0$) across 20 data points, colored according to their respective $S$ value. The dashed vertical lines in this first column mark the bounds of the frequency range used to compute $Y$. The middle column shows the template (green) versus modified (red) values of the inversion parameter $b_e$ (top) and bias parameter $c_{ie}$ (bottom) for each data point. The last column displays the relationship between the structural feature $S$ and the target feature $Y$, along with the fitted statistical model $R$, which corresponds to a univariate ordinary least squares linear regression. These panels illustrate all relevant components of the synthetic data with a manually inverted empirical relationship between $S$ and $Y$, providing a ground truth against which mechanistic hypotheses can be evaluated. This setup introduces stronger overlap between $Y$ and $Z$ as well as parameter-sharing constraints, enabling evaluation of the framework under more complex modeling conditions.
  • ...and 2 more figures
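
The comparison between an empirical statistical model $R$ and a mirror statistical model $\tilde{R}_\alpha$ described in the captions above (intercept, slope, and residual CDF, mean, and variance) can be sketched as follows. The captions do not name the specific tests, so every choice here is an assumption made for illustration: a z-test on coefficient differences, a two-sample Kolmogorov-Smirnov test, Welch's t-test, and Levene's test, with residuals of both datasets taken relative to the empirical fit. The function name `compare_models` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats

def compare_models(s_emp, y_emp, s_mir, y_mir, alpha=0.05):
    """Return five binary outcomes: True means a significant difference."""
    fit_emp = stats.linregress(s_emp, y_emp)  # empirical OLS model R
    fit_mir = stats.linregress(s_mir, y_mir)  # mirror OLS model

    def coef_differs(b1, se1, b2, se2):
        # Assumed z-test on the difference of two independent estimates
        z = (b1 - b2) / np.hypot(se1, se2)
        return bool(2.0 * stats.norm.sf(abs(z)) < alpha)

    # Assumption: residuals of both datasets are measured against R
    res_emp = y_emp - (fit_emp.intercept + fit_emp.slope * s_emp)
    res_mir = y_mir - (fit_emp.intercept + fit_emp.slope * s_mir)

    return {
        "intercept": coef_differs(fit_emp.intercept, fit_emp.intercept_stderr,
                                  fit_mir.intercept, fit_mir.intercept_stderr),
        "slope": coef_differs(fit_emp.slope, fit_emp.stderr,
                              fit_mir.slope, fit_mir.stderr),
        "residual_cdf": bool(stats.ks_2samp(res_emp, res_mir).pvalue < alpha),
        "residual_mean": bool(
            stats.ttest_ind(res_emp, res_mir, equal_var=False).pvalue < alpha),
        "residual_variance": bool(stats.levene(res_emp, res_mir).pvalue < alpha),
    }

rng = np.random.default_rng(1)
s = np.linspace(0.0, 1.0, 20)                          # 20 data points
y_emp = 1.0 + 2.0 * s + rng.normal(0.0, 0.1, s.size)   # synthetic "empirical" data
y_good = 1.0 + 2.0 * s + rng.normal(0.0, 0.1, s.size)  # mirror data, same relationship
y_bad = 1.0 - 2.0 * s + rng.normal(0.0, 0.1, s.size)   # mirror data, inverted slope

matched = compare_models(s, y_emp, s, y_good)
mismatched = compare_models(s, y_emp, s, y_bad)
print("matched:", matched)
print("mismatched:", mismatched)
```

A hypothesis would be retained under this sketch only when none of the five outcomes flags a difference; how the outcomes are combined in the paper is not stated in the captions.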