Normality Tests
Distribution & Descriptive

Runs five complementary normality tests simultaneously (Shapiro-Wilk, Lilliefors, Anderson-Darling, Cramer-von Mises, and Jarque-Bera) to assess whether data follow a normal distribution.
When to Use
Use these tests before applying parametric methods that assume normality (t-tests, ANOVA, Pearson correlation, linear regression). For example, checking whether residuals from a regression model are normally distributed, or whether continuous measurements from a clinical trial can be analysed with a t-test.
Assumptions
- The data are a random sample from a population.
- Observations are independent.
- The variable is continuous.
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Values | Numeric | Continuous data to test for normality |
Output Metrics
| Metric | What it means |
|---|---|
| Shapiro-Wilk W | Test statistic. Values close to 1 indicate normality. Most powerful for small samples (N < 50). |
| Shapiro-Wilk p-value | P-value for the Shapiro-Wilk test. |
| Kolmogorov-Smirnov D | Maximum absolute difference between empirical and theoretical normal CDFs. In easyCris this test uses the Lilliefors correction for estimated parameters. |
| Kolmogorov-Smirnov p-value | P-value for the Kolmogorov-Smirnov test. Because the mean and standard deviation are estimated from the sample, the backend applies the Lilliefors correction rather than the standard KS p-value. |
| Anderson-Darling A-squared | Weighted measure of departure from normality, emphasising tail behaviour. |
| Anderson-Darling p-value | P-value for the Anderson-Darling test. |
| Cramer-von Mises W-squared | Integral of the squared difference between empirical and theoretical CDFs. |
| Cramer-von Mises p-value | P-value for the Cramer-von Mises test. |
| Jarque-Bera JB | Test statistic based on skewness and excess kurtosis. Sensitive to departures in the tails. |
| Jarque-Bera p-value | P-value for the Jarque-Bera test. |
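As an illustration, the Lilliefors-corrected KS statistic and p-value described above can be reproduced with statsmodels. This is a sketch of the same method on simulated data, not necessarily the exact easyCris backend implementation:

```python
import numpy as np
from statsmodels.stats.diagnostic import lilliefors

rng = np.random.default_rng(1)
x = rng.normal(loc=0, scale=1, size=60)  # simulated continuous sample

# Lilliefors test: KS distance against a normal CDF whose mean and
# standard deviation are estimated from the sample, with a corrected p-value
d_stat, p_value = lilliefors(x, dist="norm")
print(f"D = {d_stat:.4f}, p = {p_value:.4f}")
```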
Interpretation
- If the p-value from a test is below the chosen significance level (alpha), that test rejects normality. Consider the consensus across all five tests rather than relying on a single one.
- Shapiro-Wilk is generally the most powerful test for small to moderate samples (N < 2000) and is often considered the primary reference.
- Anderson-Darling is particularly sensitive to departures in the tails, making it useful when tail behaviour matters (e.g., for extreme value analysis).
- Jarque-Bera focuses specifically on skewness and kurtosis. A significant result pinpoints whether non-normality comes from asymmetry, heavy tails, or both.
- Always complement formal tests with a Q-Q (quantile-quantile) plot. Visual assessment provides context that p-values alone cannot.
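The consensus approach can be sketched with SciPy on simulated data (an illustration only; easyCris computes these tests in its own backend). Note that SciPy's Anderson-Darling routine returns critical values rather than a p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(loc=10, scale=2, size=80)  # simulated continuous sample

# Shapiro-Wilk: W near 1 suggests normality
w_stat, w_p = stats.shapiro(x)

# Jarque-Bera: checks skewness and excess kurtosis jointly
jb_stat, jb_p = stats.jarque_bera(x)

# Anderson-Darling: compare the statistic against tabulated critical values
ad = stats.anderson(x, dist="norm")

print(f"Shapiro-Wilk: W = {w_stat:.4f}, p = {w_p:.4f}")
print(f"Jarque-Bera: JB = {jb_stat:.4f}, p = {jb_p:.4f}")
print(f"Anderson-Darling: A2 = {ad.statistic:.4f}, "
      f"5% critical value = {ad.critical_values[2]:.4f}")
```

If the statistics disagree, the Q-Q plot is the tie-breaker: it shows where (center vs. tails) the sample departs from normality.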
Common Pitfalls
- With large samples (N > 500), normality tests become overly sensitive and reject normality for trivial departures that do not materially affect parametric test validity. Use Q-Q plots to judge practical significance.
- With very small samples (N < 20), normality tests have low power and may fail to detect meaningful departures from normality.
- A non-significant result does not prove normality. It only means you do not have sufficient evidence to reject it.
- Testing raw data for normality is often the wrong question. For t-tests and ANOVA, it is the residuals (or within-group distributions) that should be normal, not the overall data.
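The residuals-versus-raw-data pitfall can be demonstrated with a two-group example (a sketch using SciPy; the group sizes and means are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.normal(loc=5, scale=1, size=30)
group_b = rng.normal(loc=8, scale=1, size=30)

# Wrong question: the pooled raw data is a mixture of two groups with
# different means, so it is bimodal even though each group is normal
pooled = np.concatenate([group_a, group_b])
_, p_pooled = stats.shapiro(pooled)

# Right question for a t-test/ANOVA: the within-group residuals
residuals = np.concatenate([group_a - group_a.mean(),
                            group_b - group_b.mean()])
_, p_resid = stats.shapiro(residuals)

print(f"Pooled raw data: p = {p_pooled:.4f}")
print(f"Within-group residuals: p = {p_resid:.4f}")
```

With group means three standard deviations apart, the pooled test will typically reject normality while the residual test will not, even though the t-test's assumption is perfectly satisfied.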
How It Works
- Shapiro-Wilk: Computes the ratio of the best linear estimate of variance to the usual variance estimate. A ratio close to 1 indicates normality.
- Lilliefors: Measures the maximum vertical distance between the empirical CDF and a normal CDF with the same mean and standard deviation.
- Anderson-Darling: Integrates the squared difference between the empirical and theoretical CDFs, with extra weight given to the tails.
- Jarque-Bera: Tests whether the sample skewness and excess kurtosis are jointly zero, which they would be for a normal distribution.
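As an example of the last step, the Jarque-Bera statistic can be computed by hand from the sample central moments and compared against SciPy's implementation (a sketch; the chi-squared reference distribution with 2 degrees of freedom is the standard asymptotic approximation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=200)
n = len(x)

# Biased central moments of the sample
d = x - x.mean()
m2 = np.mean(d ** 2)
m3 = np.mean(d ** 3)
m4 = np.mean(d ** 4)

skew = m3 / m2 ** 1.5          # sample skewness (0 for a normal)
ex_kurt = m4 / m2 ** 2 - 3.0   # excess kurtosis (0 for a normal)

# JB = n/6 * (S^2 + K^2/4), asymptotically chi-squared with 2 df
jb = n / 6.0 * (skew ** 2 + ex_kurt ** 2 / 4.0)
p = stats.chi2.sf(jb, df=2)

# Cross-check against SciPy
jb_ref, p_ref = stats.jarque_bera(x)
print(f"manual JB = {jb:.6f}, scipy JB = {jb_ref:.6f}")
```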
References
- Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3-4), 591-611.
- Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318), 399-402.
- Anderson, T. W., & Darling, D. A. (1952). Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. Annals of Mathematical Statistics, 23(2), 193-212.
- Cramér, H. (1928). On the composition of elementary errors. Scandinavian Actuarial Journal, 1928(1), 13-74.
- Jarque, C. M., & Bera, A. K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters, 6(3), 255-259.