Mann-Whitney U Test

A non-parametric test that compares the distributions of two independent groups by ranking all observations together, without assuming normality.

When to Use

Use this test when you want to compare two independent groups but cannot assume normality, when data are ordinal, or when sample sizes are small. For example, comparing patient satisfaction ratings (on a Likert scale) between two hospitals, or comparing reaction times when the data are heavily skewed.

Assumptions

Observations in each group are independent.
The dependent variable is at least ordinal (ranks are meaningful).
If you want to interpret the result as a comparison of medians, the two groups must have similarly shaped distributions. If the shapes differ, the test instead assesses whether values in one group tend to be larger than values in the other (stochastic dominance).

Required Inputs

Input	Type	Notes
Group 1	Numeric	Values for the first group
Group 2	Numeric	Values for the second group

Parameter	Default	Options
Alternative	two-sided	two-sided / less / greater

Output Metrics

Metric	What it means
N	Number of observations in each group.
Median	Median value in each group.
Rank Sum	Sum of the assigned ranks in each group.
Mann-Whitney U	The Mann-Whitney U statistic.
Expected U (H0)	Expected value of U if the null hypothesis is true.
Std Dev U (H0)	Standard deviation of U under the null hypothesis.
Z	Standardised test statistic (normal approximation).
Pr > |Z|	Two-tailed p-value based on the normal approximation.
Effect Size (r)	Rank-biserial correlation: the proportion of pairs in which the Group 1 value is higher (favourable) minus the proportion in which it is lower (unfavourable). Ranges from -1 to +1.
Median Difference	Difference between group medians.
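
The relationships between several of these metrics can be written out as follows. This is a sketch of the standard textbook forms, not a statement of easyCris internals: the symbols n_1, n_2 (group sizes), R_1 (rank sum of Group 1) and U_1, U_2 are notation assumed here, U_1 is taken to count the pairs in which the Group 1 value is higher (ties counting one half), and no tie or continuity correction is included.

```latex
\begin{align*}
U_1 &= R_1 - \frac{n_1 (n_1 + 1)}{2}, \qquad U_1 + U_2 = n_1 n_2 \\
\mathrm{E}[U \mid H_0] &= \frac{n_1 n_2}{2} \\
\mathrm{SD}[U \mid H_0] &= \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \\
Z &= \frac{U_1 - \mathrm{E}[U \mid H_0]}{\mathrm{SD}[U \mid H_0]} \\
r &= \frac{U_1 - U_2}{n_1 n_2} = \frac{2 U_1}{n_1 n_2} - 1
\end{align*}
```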

Interpretation

If the p-value (Pr > |Z|) is less than your chosen significance level (alpha), the two groups differ significantly in their rank distributions.
The rank-biserial correlation (r) quantifies the effect size. easyCris interprets its magnitude as negligible (|r| < 0.1), small (|r| < 0.3), medium (|r| < 0.5), or large (|r| >= 0.5).
A positive rank-biserial r indicates that Group 1 tends to have higher values than Group 2.
The median difference provides a practical summary, but remember the test is based on ranks, not medians directly.
With very small samples, consider using an exact p-value rather than the normal approximation.
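
The effect-size reading above can be illustrated with a minimal Python sketch. It counts favourable and unfavourable pairs directly and applies the thresholds listed in the interpretation notes. The function names `rank_biserial` and `effect_label` are hypothetical, and applying the thresholds to the absolute value of r is an assumption made here, since r can be negative.

```python
def rank_biserial(group1, group2):
    """Proportion of favourable pairs minus proportion of unfavourable pairs."""
    # Favourable: the Group 1 value is higher than the Group 2 value.
    favourable = sum(1 for x in group1 for y in group2 if x > y)
    # Unfavourable: the Group 1 value is lower. Tied pairs count for neither.
    unfavourable = sum(1 for x in group1 for y in group2 if x < y)
    return (favourable - unfavourable) / (len(group1) * len(group2))


def effect_label(r):
    """Label the magnitude of r (assumption: thresholds apply to |r|)."""
    size = abs(r)
    if size < 0.1:
        return "negligible"
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "medium"
    return "large"


r = rank_biserial([3, 4, 5], [1, 2, 6])  # 6 favourable, 3 unfavourable pairs
label = effect_label(r)
```

Here r is 3/9, about 0.33, which falls in the medium band. The positive sign says Group 1 tends to have the higher values.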

Common Pitfalls

The test does not compare medians unless the two distributions have the same shape. With different shapes, a significant result means one group tends to produce larger values.
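Tied values reduce the precision of the test. Ties are normally handled by assigning average ranks and adjusting the standard deviation of U for ties. The continuity correction addresses the discreteness of U rather than ties, so it helps but does not fully resolve the issue.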
The normal approximation for the p-value becomes less accurate with very small sample sizes (N < 10 per group).
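
For very small samples, one way to obtain an exact p-value is to enumerate every possible split of the pooled observations into two groups of the original sizes. The sketch below does this in plain Python. The function name `exact_two_sided_p` is hypothetical, the approach is a generic permutation enumeration rather than anything confirmed about easyCris, and it is only practical for small groups because the number of splits grows quickly.

```python
from itertools import combinations


def exact_two_sided_p(group1, group2):
    """Exact two-sided p-value by enumerating all regroupings of the pooled data."""
    pooled = list(group1) + list(group2)
    n1, n = len(group1), len(pooled)

    def u_stat(first, second):
        # Pairs where the first-group value is higher count 1, ties count 0.5.
        return sum(1.0 if x > y else 0.5 if x == y else 0.0
                   for x in first for y in second)

    centre = n1 * (n - n1) / 2  # expected U under the null hypothesis
    observed = abs(u_stat(group1, group2) - centre)

    extreme = 0
    total = 0
    for idx in combinations(range(n), n1):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Count splits at least as far from the centre as the observed one.
        if abs(u_stat(first, second) - centre) >= observed - 1e-12:
            extreme += 1
    return extreme / total


p_exact = exact_two_sided_p([1, 2, 3], [4, 5, 6])  # 2 of 20 splits -> 0.1
```

With three observations per group and complete separation, the exact two-sided p-value is 0.1, whereas the uncorrected normal approximation gives roughly 0.0495 for the same data. That gap shows how far the approximation can drift at this sample size.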

How It Works

Combine all observations from both groups and assign ranks from smallest to largest.
Sum the ranks for each group separately.
Compute the U statistic as the number of times an observation from Group 1 precedes an observation from Group 2 in the combined ranking.
Standardise U to a Z-score using the expected value and standard deviation under the null hypothesis, then compute the p-value from the normal distribution.
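
The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the easyCris implementation: the function and key names are hypothetical, tied values share their average rank, the standard deviation carries no tie correction, no continuity correction is applied, and Z is computed from the count of pairs in which Group 1 is higher so that a positive Z means Group 1 tends to be larger. The "precedes" count described in step 3 is returned alongside it. The two counts sum to n1 * n2 and give the same two-sided p-value.

```python
import math


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_normal(group1, group2):
    """Rank sums, U, Z and two-sided p-value via the normal approximation."""
    n1, n2 = len(group1), len(group2)
    # Step 1: rank all observations together.
    ranks = average_ranks(list(group1) + list(group2))
    # Step 2: sum the ranks for each group.
    rank_sum_1 = sum(ranks[:n1])
    rank_sum_2 = sum(ranks[n1:])
    # Step 3: U from the rank sum. This equals the number of pairs in which
    # the Group 1 value is higher (ties count 0.5); the complement is the
    # number of pairs in which the Group 1 value precedes the Group 2 value.
    u_group1_higher = rank_sum_1 - n1 * (n1 + 1) / 2
    u_group1_precedes = n1 * n2 - u_group1_higher
    # Step 4: standardise and convert to a two-sided p-value.
    expected_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumes no tie correction
    z = (u_group1_higher - expected_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return {
        "rank_sum_1": rank_sum_1,
        "rank_sum_2": rank_sum_2,
        "u_group1_higher": u_group1_higher,
        "u_group1_precedes": u_group1_precedes,
        "expected_u": expected_u,
        "sd_u": sd_u,
        "z": z,
        "p_two_sided": p_two_sided,
    }


result = mann_whitney_normal([1, 2, 3], [4, 5, 6])
```

For this toy input the rank sums are 6 and 15, Group 1 is never higher than Group 2, and Z is about -1.96.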

Citations

References

Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50-60.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.
