Mediation Analysis (Hayes Model 4)
Tests whether the effect of an independent variable (X) on an outcome (Y) is transmitted through one or more mediating variables (M), decomposing the total effect into direct and indirect components.
When to Use
Use this analysis when you have a theoretical reason to believe that X influences Y through an intermediary mechanism M. For example, testing whether a training programme (X) improves job performance (Y) through increased self-efficacy (M), or whether drug treatment (X) reduces tumour size (Y) by suppressing a specific biomarker (M).
Assumptions
- Correct causal ordering: X precedes M, which precedes Y in time or logic.
- No unmeasured confounders of the X-M, M-Y, or X-Y relationships.
- For continuous outcomes: linearity of all regression paths.
- Residuals are independent and, for parametric inference, normally distributed. The bootstrap relaxes the normality requirement but not independence.
- Adequate sample size for bootstrap confidence intervals (N >= 50, ideally N >= 100).
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Independent Variable (X) | Numeric / Categorical | The predictor variable |
| Mediator (M) | Numeric / Categorical | The proposed mediating variable |
| Dependent Variable (Y) | Numeric / Categorical | The outcome variable |
| Covariates (optional) | Numeric / Categorical | Optional control variables |

| Parameter | Default | Options |
|---|---|---|
| Bootstrap Samples | 5000 | 1000 - 10000 |
Output Metrics
| Metric | What it means |
|---|---|
| Outcome Variable | Name of the dependent variable. |
| Predictor Variable | Name of the independent variable. |
| Mediator Variable | Name of the mediating variable. |
| N | Sample size. |
| N Bootstrap | Number of bootstrap resamples used. |
| Mediator Model R-squared | R-squared for the regression of M on X (path a model). |
| Mediator Model F | F-statistic for the mediator model. |
| Path a (X to M) | Coefficient for the effect of X on M. With SE, t, and p-value. |
| Outcome Model R-squared | R-squared for the regression of Y on X and M (paths b and c' model). |
| Outcome Model F | F-statistic for the outcome model. |
| Path b (M to Y, controlling for X) | Coefficient for the effect of M on Y, holding X constant. |
| Path c' (Direct Effect) | Effect of X on Y after controlling for M. With SE, t, p, and bootstrap CI. |
| Indirect Effect (a*b) | Product of paths a and b. The effect of X on Y transmitted through M. With bootstrap SE and CI. |
| Total Effect (c) | Total effect of X on Y (direct + indirect). With SE, t, and p-value. |
| Proportion Mediated | Indirect effect / total effect. The fraction of the total effect that passes through M. With bootstrap CI. |
| Sobel Test Effect | Point estimate of the indirect effect (a*b) evaluated by the Sobel test (normal-theory approximation). |
| Sobel Test SE | Normal-theory standard error of the indirect effect used by the Sobel test. |
| Sobel Test z | Z-statistic for the Sobel test. |
| Sobel Test p | P-value for the Sobel test. |
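The four Sobel rows can be reproduced from paths a and b and their standard errors. The sketch below assumes the first-order standard error from Sobel (1982); the function name `sobel_test` and its argument names are illustrative, not part of any particular package.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for the indirect effect a * b (illustrative helper)."""
    effect = a * b
    # First-order (delta-method) standard error of the product a * b.
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = effect / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return effect, se, z, p


# Made-up path estimates, for illustration only.
effect, se, z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(effect, se, z, p)
```

Because the product of two normally distributed estimates is generally not normal, this z-statistic is only an approximation, which is why the bootstrap CI is preferred.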
Interpretation
- If the bootstrap confidence interval for the indirect effect (a*b) excludes zero, the indirect effect is statistically significant at the corresponding level (for example, alpha = 0.05 for a 95% CI). This is the recommended test.
- The Sobel test is a traditional alternative but assumes normality of the indirect effect distribution, which is often violated. Prefer the bootstrap CI.
- The direct effect (c') represents the portion of X's effect on Y that does not pass through M. A non-significant c' together with a significant indirect effect is traditionally described as full mediation, although a non-significant c' does not establish that the direct effect is zero.
- Proportion mediated tells you how much of the total effect goes through the mediator. For example, a proportion of 0.40 means 40% of the effect is mediated.
- A significant total effect (c) is not required for the indirect effect to be significant. The total effect can be close to zero when the direct and indirect effects have opposite signs and partly cancel (inconsistent mediation, also called suppression).
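In symbols, the effects interpreted above come from two linear regressions. The block below is a sketch written without covariates; the intercepts $i_M$, $i_Y$ and error terms $e_M$, $e_Y$ are notation introduced here for illustration.

```latex
\begin{aligned}
M &= i_M + aX + e_M \\
Y &= i_Y + c'X + bM + e_Y \\
c &= c' + ab
\end{aligned}
```

For ordinary least squares fitted on the same cases, the last identity holds exactly. With made-up values $ab = 0.30$ and $c' = -0.25$, the total effect is $c = 0.05$: a sizeable indirect effect can sit alongside a total effect near zero.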
Common Pitfalls
- Mediation analysis cannot prove causation from cross-sectional data. Longitudinal designs with temporal ordering provide stronger evidence.
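- Unmeasured confounders of the M-Y relationship are a major threat, even when X is randomised. Sensitivity analyses (e.g., the rho parameter of Imai and colleagues, the correlation between the residuals of the mediator and outcome models) can assess how robust the findings are.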
- The proportion mediated is unstable when the total effect is near zero. Avoid interpreting it as a precise quantity in such cases.
- Mediators operating in series require a different model (e.g., Hayes Model 6 for serial mediation). Model 4 in PROCESS can accommodate several mediators in parallel, but the inputs and steps described here cover a single mediator path.
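The instability of the proportion mediated is easy to see numerically. The sketch below uses a hypothetical helper, `proportion_mediated`, and made-up effect sizes: the indirect effect is held at 0.30 while the direct effect moves the total effect toward zero.

```python
def proportion_mediated(indirect, direct):
    """Indirect effect divided by the total effect (direct + indirect)."""
    total = direct + indirect
    if total == 0:
        raise ZeroDivisionError("total effect is zero; proportion mediated is undefined")
    return indirect / total


# Made-up values: same indirect effect, progressively smaller total effect.
for direct in (0.45, 0.10, -0.25, -0.29):
    print(direct, round(proportion_mediated(0.30, direct), 2))
```

With a direct effect of 0.45 the ratio is 0.40, but as the total effect shrinks the ratio climbs far above 1, where it no longer reads as a fraction of anything.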
How It Works
- Fit the mediator model: regress M on X (and covariates) to obtain path a.
- Fit the outcome model: regress Y on both X and M (and covariates) to obtain path b (M to Y) and path c' (direct effect of X on Y).
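- Compute the indirect effect as the product a * b. For linear models, the total effect is c = c' + a * b, which equals the coefficient from regressing Y on X (and covariates) alone.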
- Use bootstrapping: resample the data many times, re-estimate the indirect effect each time, and construct a percentile confidence interval from the bootstrap distribution.
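The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the PROCESS implementation: it assumes numeric X, M, and Y with no covariates, uses closed-form least-squares slopes, and the names `paths` and `bootstrap_indirect` are hypothetical.

```python
import random
from statistics import fmean


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = fmean(u), fmean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)


def paths(x, m, y):
    """Return (a, b, c_prime) from the mediator and outcome regressions."""
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    a = sxm / sxx                                # slope of M on X
    det = sxx * smm - sxm ** 2                   # zero if X and M are collinear
    b = (sxx * smy - sxm * sxy) / det            # slope of Y on M, holding X fixed
    c_prime = (smm * sxy - sxm * smy) / det      # slope of Y on X, holding M fixed
    return a, b, c_prime


def bootstrap_indirect(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Point estimate and percentile bootstrap CI for the indirect effect a * b."""
    rng = random.Random(seed)
    n = len(x)
    a, b, _ = paths(x, m, y)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]    # resample rows with replacement
        a_s, b_s, _ = paths([x[i] for i in idx],
                            [m[i] for i in idx],
                            [y[i] for i in idx])
        estimates.append(a_s * b_s)
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(n_boot * tail)]           # simple percentile rule
    upper = estimates[round(n_boot * (1 - tail)) - 1]
    return a * b, (lower, upper)
```

Rows are resampled together so that each bootstrap sample keeps the X, M, Y values of a case intact. The sketch does not guard against a degenerate resample in which X and M are perfectly collinear, which is unlikely with continuous data at the sample sizes recommended above.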
References
- Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research. Journal of Personality and Social Psychology, 51(6), 1173-1182.
- Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290-312.
- Hayes, A. F. (2022). Introduction to Mediation, Moderation, and Conditional Process Analysis (3rd ed.). Guilford Press (PROCESS framework).