Mediation Analysis (Hayes Model 4)

Mediation & Moderation

Tests whether the effect of an independent variable (X) on an outcome (Y) is transmitted through one or more mediating variables (M), decomposing the total effect into direct and indirect components.
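
The decomposition can be written out for the simplest case. This is a sketch that assumes continuous M and Y, ordinary least squares, and no covariates. The intercepts (i) and error terms (e) are notation introduced here, not names used by the tool.

```latex
\begin{align}
M &= i_M + aX + e_M \\
Y &= i_Y + c'X + bM + e_Y \\
Y &= i_{Y^*} + cX + e_{Y^*} \\
c &= c' + ab
\end{align}
```

Under these assumptions the last identity holds exactly when all three regressions are fitted to the same cases, so the total effect c splits into the direct effect c' and the indirect effect ab.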

When to Use

Use this analysis when you have a theoretical reason to believe that X influences Y through an intermediary mechanism M. Examples include testing whether a training programme (X) improves job performance (Y) through increased self-efficacy (M), or whether a drug treatment (X) reduces tumour size (Y) by suppressing a specific biomarker (M).

Assumptions

  • Correct causal ordering: X precedes M, which precedes Y in time or logic.
  • No unmeasured confounders of the X-M, M-Y, or X-Y relationships.
  • For continuous outcomes: linearity of all regression paths.
  • Residuals are independent. Normally distributed residuals are also needed for parametric inference (t, F, and Sobel tests); the bootstrap relaxes the normality requirement but not independence.
  • Adequate sample size for bootstrap confidence intervals (N >= 50, ideally N >= 100).

Required Inputs

Input | Type | Notes
Independent Variable (X) | Numeric / Categorical | The predictor variable
Mediator (M) | Numeric / Categorical | The proposed mediating variable
Dependent Variable (Y) | Numeric / Categorical | The outcome variable
Covariates (optional) | Numeric / Categorical | Optional control variables

Parameter | Default | Options
Bootstrap Samples | 5000 | 1000 - 10000

Output Metrics

Metric | What it means
Outcome Variable | Name of the dependent variable.
Predictor Variable | Name of the independent variable.
Mediator Variable | Name of the mediating variable.
N | Sample size.
N Bootstrap | Number of bootstrap resamples used.
Mediator Model R-squared | R-squared for the regression of M on X (path a model).
Mediator Model F | F-statistic for the mediator model.
Path a (X to M) | Coefficient for the effect of X on M. With SE, t, and p-value.
Outcome Model R-squared | R-squared for the regression of Y on X and M (paths b and c' model).
Outcome Model F | F-statistic for the outcome model.
Path b (M to Y, controlling for X) | Coefficient for the effect of M on Y, holding X constant.
Path c' (Direct Effect) | Effect of X on Y after controlling for M. With SE, t, p, and bootstrap CI.
Indirect Effect (a*b) | Product of paths a and b. The effect of X on Y transmitted through M. With bootstrap SE and CI.
Total Effect (c) | Total effect of X on Y (direct + indirect). With SE, t, and p-value.
Proportion Mediated | Indirect effect / total effect. The fraction of the total effect that passes through M. With bootstrap CI.
Sobel Test Effect | Point estimate of the indirect effect (a*b) evaluated by the Sobel test (normal-theory approximation).
Sobel Test SE | Normal-theory standard error of the indirect effect used by the Sobel test.
Sobel Test z | Z-statistic for the Sobel test (effect divided by SE).
Sobel Test p | P-value for the Sobel test.
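
The four Sobel metrics can be reproduced by hand from paths a and b and their standard errors. The sketch below assumes the first-order Sobel (1982) standard error and a two-sided p-value; the page does not say which variant the tool uses, and `sobel_test` is a hypothetical helper name.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for the indirect effect a*b (assumed variant)."""
    effect = a * b
    # First-order delta-method standard error of the product a*b
    se = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = effect / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return effect, se, z, p


# Illustrative inputs only, not output from the tool
effect, se, z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"effect={effect:.3f} se={se:.4f} z={z:.2f} p={p:.4f}")
```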

Interpretation

  • If the bootstrap confidence interval for the indirect effect (a*b) excludes zero, the indirect effect is statistically significant at the corresponding level (for example, alpha = 0.05 for a 95% CI). This is the recommended test.
  • The Sobel test is a traditional alternative, but it assumes that the sampling distribution of a*b is normal. The product of two coefficients is typically skewed, especially in small samples, so prefer the bootstrap CI.
  • The direct effect (c') represents the portion of X's effect on Y that does not pass through M. If c' is non-significant but the indirect effect is significant, the pattern is consistent with full mediation, although a non-significant c' does not show that the direct effect is exactly zero.
  • Proportion mediated tells you how much of the total effect goes through the mediator. For example, a proportion of 0.40 means 40% of the effect is mediated.
  • A significant total effect (c) is not required for the indirect effect to be significant. When the direct and indirect effects have opposite signs (inconsistent mediation, also called suppression), they can offset each other and leave a total effect near zero.
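
The last point can be illustrated with simulated data in which the direct and indirect effects offset each other. This is a minimal sketch using NumPy and ordinary least squares; the coefficients, sample size, seed, and the helper `ols_coefs` are all assumptions made for the illustration.

```python
import numpy as np


def ols_coefs(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] via least squares."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


# Simulated data: true indirect effect 0.8 * 0.5 = 0.4, true direct effect -0.4
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)
y = -0.4 * x + 0.5 * m + rng.normal(size=n)

a = ols_coefs(m, x)[1]                    # path a: M on X
c_prime, b = ols_coefs(y, x, m)[1:3]      # paths c' and b: Y on X and M
c = ols_coefs(y, x)[1]                    # total effect: Y on X alone
indirect = a * b

print(f"indirect={indirect:.3f} direct={c_prime:.3f} total={c:.3f}")
```

Because the three regressions use the same cases, the estimated total effect equals the sum of the estimated direct and indirect effects, and here that sum is close to zero even though both components are not.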

Common Pitfalls

  • Mediation analysis cannot prove causation from cross-sectional data. Longitudinal designs with temporal ordering provide stronger evidence.
  • Unmeasured confounders of the M-Y relationship are a major threat. Sensitivity analyses (e.g., Imai's rho) can assess how robust the findings are.
  • The proportion mediated is unstable when the total effect is near zero. Avoid interpreting it as a precise quantity in such cases.
  • Multiple mediators operating in parallel or in series require more complex models (e.g., Hayes Model 6 for serial mediation). This analysis estimates a single mediator path.
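
The instability of the proportion mediated is easy to see with arithmetic. The sketch below holds an illustrative indirect effect fixed at 0.20 and varies the total effect; `proportion_mediated` is a hypothetical helper, and the numbers are made up for the example.

```python
import math


def proportion_mediated(indirect, total):
    """Indirect effect divided by total effect; undefined when total is zero."""
    if total == 0:
        return math.nan
    return indirect / total


# Same indirect effect, shrinking total effect: the ratio leaves the 0-1 range
for total in (0.50, 0.05, 0.01, -0.01):
    ratio = proportion_mediated(0.20, total)
    print(f"total={total:+.2f} proportion={ratio:+.1f}")
```

As the total effect approaches zero the ratio exceeds 1 and can flip sign, so it no longer reads as a fraction of the effect.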

How It Works

  1. Fit the mediator model: regress M on X (and covariates) to obtain path a.
  2. Fit the outcome model: regress Y on both X and M (and covariates) to obtain path b (M to Y) and path c' (direct effect of X on Y).
  3. Compute the indirect effect as the product a * b.
  4. Use bootstrapping: resample the data many times, re-estimate the indirect effect each time, and construct a percentile confidence interval from the bootstrap distribution.
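
The four steps above can be sketched in a few lines of NumPy. This is not the tool's implementation: it assumes numeric X, M, and Y, ordinary least squares, no covariates, and a percentile interval, and the names `mediation_bootstrap` and `_slopes` are hypothetical.

```python
import numpy as np


def _slopes(y, *predictors):
    """OLS slopes (intercept dropped) for y regressed on the predictors."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]


def mediation_bootstrap(x, m, y, n_boot=5000, ci=0.95, seed=0):
    """Simple mediation with a percentile bootstrap CI for a*b."""
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))
    n = len(x)

    a = _slopes(m, x)[0]              # step 1: mediator model, path a
    c_prime, b = _slopes(y, x, m)     # step 2: outcome model, paths c' and b
    indirect = a * b                  # step 3: indirect effect

    # Step 4: resample cases with replacement and re-estimate a*b each time
    rng = np.random.default_rng(seed)
    boot = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        a_i = _slopes(m[idx], x[idx])[0]
        _, b_i = _slopes(y[idx], x[idx], m[idx])
        boot[i] = a_i * b_i

    alpha = 1 - ci
    lower, upper = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return {"a": a, "b": b, "c_prime": c_prime, "indirect": indirect,
            "boot_se": boot.std(ddof=1), "ci": (lower, upper)}


# Simulated data for illustration only (true a = 0.5, b = 0.4, c' = 0.2)
rng = np.random.default_rng(42)
x = rng.normal(size=400)
m = 0.5 * x + rng.normal(size=400)
y = 0.2 * x + 0.4 * m + rng.normal(size=400)

result = mediation_bootstrap(x, m, y)
print(result)
```

A percentile interval is used here because it is the method the steps describe; bias-corrected intervals are a common alternative that this sketch does not implement.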

References

  • Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research. Journal of Personality and Social Psychology, 51(6), 1173-1182.
  • Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290-312.
  • Hayes, A. F. (2022). Introduction to Mediation, Moderation, and Conditional Process Analysis (3rd ed.). Guilford Press (PROCESS framework).