Mann-Whitney U Test
Hypothesis Testing
A non-parametric test that compares the distributions of two independent groups by ranking all observations together, without assuming normality.
When to Use
Use this test when you want to compare two independent groups but cannot assume normality, when data are ordinal, or when sample sizes are small. For example, comparing patient satisfaction ratings (on a Likert scale) between two hospitals, or comparing reaction times when the data are heavily skewed.
Assumptions
- Observations in each group are independent.
- The dependent variable is at least ordinal (ranks are meaningful).
- The two groups have similarly shaped distributions if you want to interpret the test as a comparison of medians. If the shapes differ, the test instead assesses whether values from one group tend to be larger than values from the other (stochastic dominance).
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Group 1 | Numeric | Values for the first group |
| Group 2 | Numeric | Values for the second group |

| Parameter | Default | Options |
|---|---|---|
| Alternative | two-sided | two-sided / less / greater |
Output Metrics
| Metric | What it means |
|---|---|
| N | Number of observations in each group. |
| Median | Median value in each group. |
| Rank Sum | Sum of the assigned ranks in each group. |
| Mann-Whitney U | The Mann-Whitney U statistic. |
| Expected U (H0) | Expected value of U if the null hypothesis is true. |
| Std Dev U (H0) | Standard deviation of U under the null hypothesis. |
| Z | Standardised test statistic (normal approximation). |
| Pr > \|Z\| | Two-tailed p-value based on the normal approximation. |
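| Effect Size (r) | Rank-biserial correlation: across all pairs formed by taking one observation from each group, the proportion in which Group 1 is larger minus the proportion in which Group 2 is larger. Ranges from -1 to +1. |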
| Median Difference | Difference between group medians. |
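The null-hypothesis quantities in the table follow standard formulas, sketched here for reference. The symbols are assumptions introduced for illustration: n_1 and n_2 are the group sizes, n is their sum, and t_k is the number of observations sharing the k-th tied value. Whether easyCris applies the tie term or a continuity correction to Z is not stated in this table, so treat this as the textbook form rather than the tool's exact computation.

```latex
\begin{aligned}
\mathrm{E}[U] &= \frac{n_1 n_2}{2}, \\
\sigma_U &= \sqrt{\frac{n_1 n_2}{12}\left[(n + 1) - \frac{\sum_k \left(t_k^3 - t_k\right)}{n(n - 1)}\right]},
  \qquad n = n_1 + n_2, \\
Z &= \frac{U - \mathrm{E}[U]}{\sigma_U}.
\end{aligned}
```

With no ties the sum is zero and the standard deviation reduces to the square root of n_1 n_2 (n + 1) / 12.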
Interpretation
- If Pr > |Z| is less than alpha, the two groups differ significantly in their rank distributions.
- The rank-biserial correlation (r) quantifies the effect size. easyCris interprets its magnitude as negligible (<0.1), small (<0.3), medium (<0.5), or large (>=0.5).
- A positive rank-biserial r indicates that Group 1 tends to have higher values than Group 2.
- The median difference provides a practical summary, but remember the test is based on ranks, not medians directly.
- With very small samples, consider using an exact p-value rather than the normal approximation.
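The effect-size rules above can be sketched in a few lines of Python. This is an illustrative sketch, not easyCris's own code: the function names `rank_biserial` and `effect_size_label` are hypothetical, and applying the thresholds to the absolute value of r is an assumption.

```python
def rank_biserial(group1, group2):
    """Proportion of favourable pairs minus proportion of unfavourable pairs.

    A pair is favourable when the Group 1 value is larger, unfavourable when
    the Group 2 value is larger; tied pairs count for neither.
    """
    favourable = sum(1 for a in group1 for b in group2 if a > b)
    unfavourable = sum(1 for a in group1 for b in group2 if a < b)
    return (favourable - unfavourable) / (len(group1) * len(group2))


def effect_size_label(r):
    """Map r to the labels listed above (assumes thresholds apply to |r|)."""
    size = abs(r)
    if size < 0.1:
        return "negligible"
    if size < 0.3:
        return "small"
    if size < 0.5:
        return "medium"
    return "large"
```

A positive result from `rank_biserial` means Group 1 tends to have the higher values, matching the sign convention described above.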
Common Pitfalls
- The test does not compare medians unless the two distributions have the same shape. With different shapes, a significant result means one group tends to produce larger values.
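- Tied values reduce the precision of the test. Ties receive average ranks, and a tie correction to the standard deviation of U accounts for them. The continuity correction addresses the discreteness of U in the normal approximation, not ties.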
- The normal approximation for the p-value becomes less accurate with very small sample sizes (N < 10 per group).
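For very small samples, an exact p-value can be obtained by enumerating every way of splitting the pooled observations into two groups of the original sizes. The sketch below is one way to do this, assuming a two-sided test and counting tied pairs as one half; the function name `exact_two_sided_p` is hypothetical and this is not necessarily how any particular tool computes it.

```python
from itertools import combinations


def exact_two_sided_p(group1, group2):
    """Exact two-sided p-value by enumerating all group assignments.

    Only practical for small samples: the number of assignments grows
    as C(n1 + n2, n1).
    """
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    n2 = len(pooled) - n1
    centre = n1 * n2 / 2  # expected U under the null hypothesis

    def u_stat(first_idx):
        # U = number of pairs where the first group's value is larger,
        # with tied pairs counted as one half.
        g1 = [pooled[i] for i in first_idx]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in first_idx]
        return sum((a > b) + 0.5 * (a == b) for a in g1 for b in g2)

    observed = abs(u_stat(set(range(n1))) - centre)
    total = 0
    extreme = 0
    for idx in combinations(range(len(pooled)), n1):
        total += 1
        # Count assignments at least as far from the centre as the observed one.
        if abs(u_stat(set(idx)) - centre) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

For example, with groups `[1, 2, 3]` and `[4, 5, 6]`, only 2 of the 20 possible assignments are as extreme as the observed one, so the exact p-value is 0.1, whereas the uncorrected normal approximation gives roughly 0.05 for the same data.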
How It Works
- Combine all observations from both groups and assign ranks from smallest to largest.
- Sum the ranks for each group separately.
- Compute the U statistic as the number of times an observation from Group 1 precedes an observation from Group 2 in the combined ranking.
- Standardise U to a Z-score using the expected value and standard deviation under the null hypothesis, then compute the p-value from the normal distribution.
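The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names are hypothetical, tied values receive average ranks with the standard tie correction, the continuity correction is omitted for simplicity, and U is reported for Group 1 as the number of pairs in which the Group 1 value is larger (the count in the opposite direction is n1 * n2 minus this value).

```python
import math


def average_ranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Two-sided Mann-Whitney U test via the normal approximation."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2

    # Step 1: rank all observations together.
    ranks = average_ranks(list(group1) + list(group2))

    # Step 2: sum the ranks for each group.
    rank_sum1 = sum(ranks[:n1])
    rank_sum2 = sum(ranks[n1:])

    # Step 3: U for Group 1, derived from its rank sum.
    u1 = rank_sum1 - n1 * (n1 + 1) / 2

    # Step 4: standardise using the null mean and tie-corrected SD.
    expected_u = n1 * n2 / 2
    tie_counts = {}
    for r in ranks:
        tie_counts[r] = tie_counts.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    sd_u = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - expected_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

    return {
        "rank_sum": (rank_sum1, rank_sum2),
        "u": u1,
        "expected_u": expected_u,
        "sd_u": sd_u,
        "z": z,
        "p": p_two_sided,
    }
```

Calling `mann_whitney_u([1, 2, 3], [4, 5, 6])` gives rank sums of 6 and 15, a U of 0 for Group 1, and an expected U of 4.5.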
References
- Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50-60.
- Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.