Wilcoxon Signed-Rank Test
Hypothesis Testing
A non-parametric test for paired observations that evaluates whether the median difference between two related conditions is significantly different from zero.
When to Use
Use this test when you have paired or matched observations and the distribution of differences is not normal. For example, comparing pain scores before and after treatment when scores are on an ordinal scale, or when the distribution of differences is skewed.
Assumptions
- Observations are paired: each measurement in one condition has a corresponding measurement in the other.
- The differences between pairs are independent of each other.
- The distribution of differences is approximately symmetric around the median (not necessarily normal).
- The dependent variable is at least ordinal.
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Before / Group 1 | Numeric | First measurement values |
| After / Group 2 | Numeric | Second measurement values |
Output Metrics
| Metric | What it means |
|---|---|
| Sum of Positive Ranks | Sum of ranks for pairs where the first value exceeds the second. |
| Sum of Negative Ranks | Sum of ranks for pairs where the second value exceeds the first. |
| N Zero Differences | Number of pairs with identical values (zero differences), excluded from ranking. |
| N Pairs | Total number of pairs. |
| N Non-Zero | Number of pairs with non-zero differences (used in the test). |
| Test Statistic (S) | Signed-rank test statistic shown in the app table. |
| Expected S (H0) | Expected value of the signed-rank statistic under the null hypothesis. |
| Std Dev S (H0) | Standard deviation of the signed-rank statistic under the null hypothesis. |
| Pr >= \|S\| | Two-tailed p-value. |
| Rank-Biserial Correlation | Effect size measure ranging from -1 to +1. |
Interpretation
- If the p-value is less than the chosen significance level (alpha), reject the null hypothesis: the median difference between the two conditions is significantly different from zero.
- The rank-biserial correlation quantifies the effect size. easyCris interprets its absolute value as negligible (<0.2), small (<0.5), medium (<0.8), or large (>=0.8).
- The direction of the effect is indicated by whether the sum of positive ranks or negative ranks is larger.
- Zero differences (ties between paired values) are excluded from the test. A large number of ties reduces the effective sample size.
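The effect-size reading above can be sketched in a few lines of Python. This is a minimal illustration, not the app's implementation: the formula (W+ - W-) / (W+ + W-) is one common matched-pairs definition of the rank-biserial correlation, and it is an assumption that easyCris uses the same one. The function names `rank_biserial` and `effect_label` are hypothetical, and applying the thresholds to the absolute value is also an assumption.

```python
def rank_biserial(w_plus, w_minus):
    """Matched-pairs rank-biserial correlation from the two rank sums.

    Assumption: r = (W+ - W-) / (W+ + W-), a common formulation.
    """
    total = w_plus + w_minus
    if total == 0:
        raise ValueError("no non-zero differences to rank")
    return (w_plus - w_minus) / total


def effect_label(r):
    """Map r to the labels listed above, using its absolute value (assumed)."""
    size = abs(r)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Example: sum of positive ranks 25, sum of negative ranks 3.
r = rank_biserial(25, 3)
print(round(r, 3), effect_label(r))
```

A positive value means the positive ranks dominate (first values tend to exceed second values); a negative value means the opposite.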
Common Pitfalls
- The assumption of symmetry of differences is often overlooked. If the differences are strongly asymmetric, the test can be misleading.
- With very small samples (fewer than 10 non-zero pairs), the normal approximation for the p-value can be inaccurate. Use exact p-values when available.
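- Unlike the paired t-test, this test uses only the ranks of the absolute differences, not their actual magnitudes. A larger difference still receives a higher rank, but how much larger it is has no further effect, so extreme differences carry less influence than they would in a t-test.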
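To illustrate the small-sample pitfall, the sketch below compares an exact two-sided p-value with the normal approximation. It rests on several assumptions: there are no tied absolute differences (so the ranks are exactly 1..n), the statistic `w` is the smaller of the two rank sums, and no continuity correction is applied. The function names are hypothetical, and the app may compute its p-value differently.

```python
import math


def rank_sum_counts(n):
    """counts[s] = number of ways to pick a subset of ranks 1..n summing to s.

    Under H0 each rank is positive or negative with probability 1/2,
    so counts[s] / 2**n is the exact probability of a rank sum of s.
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Iterate downwards so each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def exact_two_sided_p(w, n):
    """Exact two-sided p-value; w is the smaller rank sum (an integer)."""
    counts = rank_sum_counts(n)
    lower_tail = sum(counts[: int(w) + 1])
    return min(1.0, 2 * lower_tail / 2 ** n)


def normal_two_sided_p(w, n):
    """Normal approximation without continuity correction."""
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


# Example: 7 non-zero pairs, smaller rank sum of 3.
print(exact_two_sided_p(3, 7))   # 10/128 = 0.078125
print(normal_two_sided_p(3, 7))  # roughly 0.063
```

In this example the approximation gives a smaller p-value than the exact calculation, which is the kind of discrepancy that matters when a result sits near the significance threshold.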
How It Works
- Compute the difference for each pair and discard pairs with zero differences.
- Rank the absolute values of the non-zero differences from smallest to largest.
- Assign each rank a positive or negative sign based on the sign of the original difference.
- Sum the positive ranks and the negative ranks separately. The test statistic W is commonly taken as the smaller of the two sums; equivalent formulations exist, and the app reports its statistic as S. Compare the statistic to its null distribution to obtain the p-value.
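The steps above can be sketched in plain Python. This is an illustrative sketch rather than the app's code: the function name `signed_rank_sums` is hypothetical, differences are taken as first value minus second value (matching the description of positive ranks in the Output Metrics table), and tied absolute differences are given average ranks, which is a common convention but an assumption here.

```python
def signed_rank_sums(first, second):
    """Return (sum of positive ranks, sum of negative ranks, n non-zero)."""
    # Step 1: differences (first - second), discarding zero differences.
    diffs = [a - b for a, b in zip(first, second) if a != b]

    # Step 2: rank absolute differences from smallest to largest,
    # giving tied values the average of the ranks they span (assumed).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1

    # Steps 3 and 4: attach signs and sum each group separately.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


before = [8, 7, 6, 9, 5, 7, 8, 6]
after = [5, 6, 6, 4, 7, 3, 7, 2]
w_plus, w_minus, n = signed_rank_sums(before, after)
print(w_plus, w_minus, n)    # 25.0 3.0 7
print(min(w_plus, w_minus))  # W = 3.0
```

One pair (6, 6) has a zero difference and is dropped, leaving 7 non-zero pairs. The two sums always add up to n(n+1)/2, here 28, which is a quick sanity check.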
References
- Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1(6), 80-83.
- Pratt, J. W. (1959). Remarks on zeros and ties in the Wilcoxon signed rank procedures. Journal of the American Statistical Association, 54(287), 655-667.