Kruskal-Wallis H Test

Hypothesis Testing

A non-parametric test that compares the distributions of three or more independent groups using ranks, serving as the non-parametric alternative to one-way ANOVA.

When to Use

Use this test to compare three or more independent groups when you cannot assume normality or when the data are ordinal. Examples include comparing patient satisfaction scores (Likert scale) across four hospital departments, or comparing survival times when the data are heavily skewed.

Assumptions

Observations in each group are independent.
The dependent variable is at least ordinal.
If you want to interpret the result as a comparison of medians, the groups must have similarly shaped distributions.

Required Inputs

Input	Type	Notes
Groups	Numeric (multiple columns)	Two or more columns, each representing one group

Output Metrics

Metric	What it means
H (Chi-Square)	The Kruskal-Wallis H statistic, which follows a chi-square distribution under the null hypothesis.
DF	Degrees of freedom (number of groups - 1).
Pr > Chi-Square	P-value for the overall test.
Epsilon-Squared	Effect size measure: H / (N - 1), where N here is the total number of observations across all groups. Ranges from 0 to 1.
N	Number of observations in each group.
Median	Median value in each group.
Sum Ranks	Sum of ranks for each group.
Mean Rank	Average rank in each group.
Expected	Expected mean rank under the null hypothesis.
Std	Standard deviation of ranks for each group.
Dunn's Z	Z-statistic for each pairwise comparison in Dunn's post-hoc test.
Dunn's Pr > |Z|	Adjusted p-value for each pairwise comparison.

Interpretation

If Pr > Chi-Square is less than alpha, at least one group's distribution differs significantly from at least one other.
Use Dunn's post-hoc test to identify which specific group pairs differ.
Epsilon-squared provides a standardised effect size on a 0-to-1 scale. Unlike the p-value, it does not become more extreme merely because the sample is large.
The mean ranks indicate which groups tend to have higher or lower values. A group with a higher mean rank tends to have larger observations.
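
The effect size formula from the Output Metrics table can be checked by hand. The following is a minimal sketch in Python, not the tool's own implementation; the function name `epsilon_squared` and the example numbers are hypothetical, and `n_total` is assumed to be the total number of observations across all groups.

```python
def epsilon_squared(h, n_total):
    """Epsilon-squared effect size: H / (N - 1), with N the total sample size."""
    if n_total < 2:
        raise ValueError("need at least two observations in total")
    return h / (n_total - 1)


# Hypothetical example: H = 7.2 from 9 observations in total.
effect = epsilon_squared(7.2, 9)
print(round(effect, 3))  # 0.9
```

Values near 0 indicate that group membership explains little of the variation in ranks; values near 1 indicate that the groups are almost completely separated.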

Common Pitfalls

Like one-way ANOVA, a significant Kruskal-Wallis result only tells you that groups differ somewhere. Always follow up with post-hoc pairwise comparisons.
The chi-square approximation for the H statistic is less accurate with small samples (fewer than 5 observations per group). Consider exact tests for very small samples.
Tied values affect the H statistic. A correction for ties is applied automatically, but a large number of ties reduces the test's sensitivity.
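
To illustrate the post-hoc follow-up, here is a minimal sketch of Dunn's pairwise comparisons in Python using only the standard library. It is not the tool's own implementation. The function names `average_ranks` and `dunn_test` are hypothetical, and the Bonferroni adjustment is an assumption: this page states that the p-values are adjusted but does not name the method.

```python
import math
from collections import Counter
from itertools import combinations


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def dunn_test(groups):
    """Return {(i, j): (z, adjusted_p)} for every pair of group indices."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Mean rank of each group, taken from the pooled ranking.
    mean_ranks, start = [], 0
    for g in groups:
        mean_ranks.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)

    # Rank variance term, reduced by the tie correction.
    tie_sum = sum(t**3 - t for t in Counter(pooled).values())
    base_var = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    if base_var <= 0:
        raise ValueError("all observations are identical")

    pairs = list(combinations(range(len(groups)), 2))
    results = {}
    for i, j in pairs:
        se = math.sqrt(base_var * (1 / len(groups[i]) + 1 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p_raw = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        # Bonferroni adjustment (assumed): multiply by the number of pairs, cap at 1.
        results[(i, j)] = (z, min(1.0, p_raw * len(pairs)))
    return results


# Hypothetical data: three groups of three observations each.
for pair, (z, p_adj) in dunn_test([[1, 2, 3], [4, 5, 6], [7, 8, 9]]).items():
    print(pair, round(z, 3), round(p_adj, 3))
```

In this example only the first and third groups differ after adjustment (z is about -2.68), which shows how the pairwise step localises a difference that the overall test only detects.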

How It Works

Combine all observations from all groups and rank them from smallest to largest.
Calculate the mean rank for each group.
Compute the H statistic, which measures how much the group mean ranks deviate from the overall mean rank, weighted by group sizes.
Compare H to the chi-square distribution with (k-1) degrees of freedom to obtain the p-value.
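
The steps above can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions, not the tool's own implementation: the function names `average_ranks`, `chi2_sf` and `kruskal_wallis` are hypothetical, tied values are assumed to receive their average rank, and the tie correction divides H by 1 - sum(t^3 - t) / (N^3 - N), where t is the size of each group of tied values.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2
    # Closed-form starting points: df = 2 (exponential) or df = 1 (erfc).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    while a < df / 2:
        q += z**a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q


def kruskal_wallis(groups):
    """Return (H, df, p_value, mean_ranks) for a list of groups."""
    # Step 1: pool all observations and rank them.
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Step 2: rank sum and mean rank for each group.
    mean_ranks, weighted, start = [], 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        start += len(g)
        mean_ranks.append(rank_sum / len(g))
        weighted += rank_sum**2 / len(g)

    # Step 3: H statistic, then the tie correction.
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    tie_sum = sum(t**3 - t for t in Counter(pooled).values())
    correction = 1 - tie_sum / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction

    # Step 4: compare H to chi-square with k - 1 degrees of freedom.
    df = len(groups) - 1
    return h, df, chi2_sf(h, df), mean_ranks


# Hypothetical data: three groups of three observations each.
h, df, p, mean_ranks = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3), df, round(p, 4), mean_ranks)  # 7.2 2 0.0273 [2.0, 5.0, 8.0]
```

The groups in this example have only three observations each, so the chi-square p-value is an approximation here; the data were chosen to make the arithmetic easy to follow by hand.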

References

Kruskal, W. H., & Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583-621.
Dunn, O. J. (1964). Multiple comparisons using rank sums. Technometrics, 6(3), 241-252.
