Wilcoxon Test
What it is
The Wilcoxon test is a family of nonparametric hypothesis tests used to compare two groups when data do not meet the assumptions required for parametric tests (for example, normality). There are two commonly used versions:
– Wilcoxon signed-rank test: for comparing two related (paired) samples.
– Wilcoxon rank-sum test (equivalent to the Mann–Whitney U test): for comparing two independent samples.
The tests were introduced by Frank Wilcoxon (1945) and are widely used for ordinal data or continuous data that are not normally distributed.
When to use it
Use a Wilcoxon test when:
– Data are ordinal or continuous but not plausibly normally distributed.
– You want a test that is robust to non-normality.
– Observations are paired (before/after, matched subjects): use the signed-rank test.
– The two groups are independent: use the rank-sum (Mann–Whitney) test.
Wilcoxon signed-rank test (paired)
Purpose: Test whether the median difference between paired observations is zero (i.e., whether the two related samples differ).
Assumptions:
– Pairs are dependent (same subject measured twice or matched pairs).
– Differences are continuous and can be ranked.
– Differences are symmetrically distributed around the median (weaker than normality).
How to calculate (signed-rank W):
1. For each pair, compute the difference Di (measurement1 − measurement2).
2. Remove pairs with Di = 0 (they contribute no information). Let n be the number of non-zero differences.
3. Take absolute values |Di| and rank them from 1 (smallest) to n (largest). For ties, assign average ranks.
4. Restore the original signs to the ranks (+ or − depending on the sign of Di).
5. Sum the ranks with the positive sign; this sum is the test statistic W+ (often reported as W).
6. Compare W (or the smaller of W+ and W−) to the appropriate reference distribution (exact table for small n or normal approximation for larger n) to obtain a p-value.
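The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (average_ranks, signed_rank, exact_two_sided_p) and the before/after numbers are made up for the example, and the exact p-value is obtained by brute-force enumeration, which is only practical for small n. In practice a library routine such as SciPy's scipy.stats.wilcoxon would normally be used.

```python
from itertools import product

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # 1-based position of the first copy of v
        count = ordered.count(v)          # number of tied copies of v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def signed_rank(sample1, sample2):
    """Return (W+, W-, ranks of the non-zero |Di|), following steps 1 to 5."""
    diffs = [a - b for a, b in zip(sample1, sample2)]        # step 1
    diffs = [d for d in diffs if d != 0]                     # step 2
    ranks = average_ranks([abs(d) for d in diffs])           # step 3
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)   # steps 4 and 5
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, ranks

def exact_two_sided_p(ranks, w_plus):
    """Step 6, exact version: under H0 each rank is + or - with probability 1/2."""
    center = sum(ranks) / 2               # expected value of W+ under H0
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= abs(w_plus - center):
            extreme += 1
    return extreme / 2 ** len(ranks)

# Hypothetical before/after measurements for 8 subjects (illustrative only).
before = [125, 130, 118, 140, 136, 128, 122, 135]
after = [120, 131, 110, 133, 136, 125, 124, 129]

w_plus, w_minus, ranks = signed_rank(before, after)
p_value = exact_two_sided_p(ranks, w_plus)
print(w_plus, w_minus, p_value)
```

With these made-up numbers one pair has a zero difference, so n = 7, W+ = 25 and W− = 3 (their sum is n(n + 1)/2 = 28), and 10 of the 128 equally likely sign patterns are at least as extreme, giving a two-sided exact p-value of about 0.078.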
Notes:
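– Ties among the |Di| and zero differences change the reference distribution of W; software typically handles both (for example, by using average ranks and a tie-corrected normal approximation).
– The test rejects the null hypothesis if the observed signed-rank statistic W is sufficiently extreme, indicating a systematic difference between paired measurements.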
Wilcoxon rank-sum test (independent; Mann–Whitney U)
Purpose: Test whether two independent samples come from the same distribution (commonly interpreted as a difference in central tendency).
Assumptions:
– Samples are independent.
– Observations are at least ordinal and can be ranked.
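– To interpret the result as a difference in medians, the two distributions must have the same shape and differ only in location; otherwise the test detects a tendency for values in one group to exceed those in the other (stochastic dominance).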
How to calculate (overview):
1. Combine all observations from both groups and rank them from smallest to largest (average ranks for ties).
2. Sum the ranks for each group (R1, R2).
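3. Compute the U statistic from the rank sums: U1 = R1 − n1(n1 + 1)/2 and U2 = R2 − n2(n2 + 1)/2, where n1 and n2 are the group sizes and U1 + U2 = n1·n2 (some software reports the rank sum R directly). For large samples, U can be converted to a z-score.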
4. Obtain a p-value from the U distribution (exact for small samples, normal approximation for large samples).
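As a rough sketch of these steps in plain Python: the helper names (average_ranks, rank_sum_u, normal_approx_p) and the two groups are invented for illustration, and the p-value uses the normal approximation with a 0.5 continuity correction and no tie correction, which is an assumption of this sketch. A library routine such as SciPy's scipy.stats.mannwhitneyu would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # 1-based position of the first copy of v
        count = ordered.count(v)          # number of tied copies of v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def rank_sum_u(group1, group2):
    """Return (R1, R2, U1, U2), following steps 1 to 3."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))   # step 1: pooled ranks
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])            # step 2: rank sums
    u1 = r1 - n1 * (n1 + 1) / 2                          # step 3: U statistics
    u2 = r2 - n2 * (n2 + 1) / 2
    return r1, r2, u1, u2

def normal_approx_p(u, n1, n2):
    """Step 4, large-sample version: return (z, two-sided p).

    Assumes no ties (the variance has no tie correction) and applies a
    0.5 continuity correction.
    """
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = max(abs(u - mean) - 0.5, 0.0) / sd
    return z, 2 * (1 - NormalDist().cdf(z))

# Hypothetical measurements for two independent groups (illustrative only).
group_a = [12, 15, 11, 18, 14]
group_b = [20, 17, 22, 19, 25, 16]

r1, r2, u1, u2 = rank_sum_u(group_a, group_b)
z, p_value = normal_approx_p(u1, len(group_a), len(group_b))
print(r1, r2, u1, u2, round(z, 3), round(p_value, 4))
```

For these made-up groups R1 = 17 and R2 = 49, so U1 = 2 and U2 = 28, and U1 + U2 equals n1·n2 = 30 as a consistency check. Samples this small would usually call for an exact p-value; the normal approximation is shown only to illustrate the conversion to a z-score.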
Interpretation:
– A small p-value suggests the two groups differ in distribution (often interpreted as differing medians when shapes are similar).
Comparison with the t-test
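- t-tests (paired or independent) assume normally distributed data (for the paired test, normally distributed differences) or samples large enough for the central limit theorem to apply; the classic (Student's) independent-samples t-test also assumes equal variances, an assumption that Welch's t-test relaxes.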
- Wilcoxon tests do not assume normality and are more appropriate for skewed or ordinal data.
- When data meet t-test assumptions, the t-test often has somewhat greater power; when data are skewed, heavy-tailed, or contain outliers, Wilcoxon tests are usually the more reliable choice.
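One way to see why rank-based tests are less sensitive to extreme values is to change a single observation and compare the two statistics. The sketch below uses made-up data and hypothetical helper names (pooled_t, mann_whitney_u); it computes only the test statistics, not p-values, and uses the pair-counting form of U, which gives the same value as the rank-sum formula.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Student's two-sample t statistic (equal-variance form)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny))

def mann_whitney_u(x, y):
    """U for x: number of (x, y) pairs with x > y, counting ties as 1/2."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)

# Hypothetical data: two fully separated groups (illustrative only).
group_a = list(range(1, 9))                # 1..8
group_b = list(range(9, 17))               # 9..16
group_b_outlier = group_b[:-1] + [1000]    # replace 16 with an extreme value

t_clean = pooled_t(group_a, group_b)
t_outlier = pooled_t(group_a, group_b_outlier)
u_clean = mann_whitney_u(group_a, group_b)
u_outlier = mann_whitney_u(group_a, group_b_outlier)

print(round(t_clean, 2), round(t_outlier, 2))   # t shrinks sharply toward zero
print(u_clean, u_outlier)                       # U is unchanged
```

The outlier inflates the variance estimate, so the t statistic drops from well beyond conventional critical values to an unremarkable size, even though every value in the second group still exceeds every value in the first. U depends only on the ordering, so it stays at its most extreme value of 0.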
Practical considerations
- Use statistical software or spreadsheets to compute exact p-values, handle ties, and apply continuity corrections when appropriate.
- For very small samples, exact methods are preferred; for moderate to large samples, normal approximations are common.
- State clearly whether the test is one-sided or two-sided when reporting results.
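To make the exact-versus-approximate and one-sided-versus-two-sided distinctions concrete, the following sketch compares both for a rank-sum U statistic. Everything here is illustrative: the function names, the sample sizes (5 and 6), and the observed U of 2 are assumptions, ties are assumed absent, and the two-sided p-value is taken as twice the smaller tail, which relies on the null distribution of U being symmetric when there are no ties.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import NormalDist

def exact_lower_tail(u_obs, n1, n2):
    """P(U <= u_obs) under H0, by enumerating every assignment of ranks (no ties)."""
    offset = n1 * (n1 + 1) / 2
    hits = sum(
        1
        for ranks in combinations(range(1, n1 + n2 + 1), n1)
        if sum(ranks) - offset <= u_obs
    )
    return hits / comb(n1 + n2, n1)

def normal_lower_tail(u_obs, n1, n2, continuity=True):
    """P(U <= u_obs) from the normal approximation (no tie correction)."""
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    shift = 0.5 if continuity else 0.0
    return NormalDist().cdf((u_obs + shift - mean) / sd)

# Hypothetical inputs: group sizes and the observed U for the first group,
# assumed to lie below its null mean n1 * n2 / 2.
n1, n2, u_obs = 5, 6, 2

p_exact_one = exact_lower_tail(u_obs, n1, n2)
p_exact_two = min(1.0, 2 * p_exact_one)
p_norm_one = normal_lower_tail(u_obs, n1, n2)
p_norm_two = min(1.0, 2 * p_norm_one)

print("exact : one-sided", round(p_exact_one, 4), "two-sided", round(p_exact_two, 4))
print("normal: one-sided", round(p_norm_one, 4), "two-sided", round(p_norm_two, 4))
```

With samples this small the approximation and the exact calculation disagree noticeably (4 of the 462 possible rank assignments are at least as extreme in the lower tail), which is why exact methods are preferred for very small samples, and why a report should say which method and which alternative were used.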
Bottom line
The Wilcoxon tests provide robust, nonparametric alternatives to paired and independent-samples t-tests. Use the signed-rank test for paired data and the rank-sum (Mann–Whitney) test for independent groups when data violate parametric assumptions or are ordinal.