Chi-Square Test of Independence
Categorical Analysis
Tests whether two categorical variables are independent by comparing observed frequencies in a contingency table to the frequencies expected under independence.
When to Use
Use this test when you have two categorical variables and want to determine whether there is a statistically significant association between them. For example, testing whether treatment type (drug A, drug B, placebo) is associated with patient outcome (improved, unchanged, worsened).
Assumptions
- Both variables are categorical (nominal or ordinal).
- Observations are independent (each subject contributes to exactly one cell).
- Expected frequencies are at least 5 in each cell. If any expected count is below 5, consider Fisher's exact test.
- The sample was drawn randomly from the population.
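The expected-frequency assumption can be checked before running the test. The following is a minimal sketch in plain Python; the function names, the example counts, and the default threshold of 5 are illustrative assumptions, not part of any particular package.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def small_expected_cells(table, threshold=5.0):
    """Return (row index, column index, expected count) for cells below the threshold."""
    expected = expected_counts(table)
    return [
        (i, j, e)
        for i, row in enumerate(expected)
        for j, e in enumerate(row)
        if e < threshold
    ]


# Hypothetical 2x2 table of observed counts (illustrative data only).
observed = [[12, 3], [9, 2]]
flagged = small_expected_cells(observed)
for i, j, e in flagged:
    print(f"cell ({i}, {j}) has expected count {e:.2f}")
```

If any cells are flagged, as they are for this small table, Fisher's exact test is the alternative named above; for 2x2 tables it is available as `scipy.stats.fisher_exact`.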
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Variable 1 | Categorical | First categorical variable (rows of the contingency table) |
| Variable 2 | Categorical | Second categorical variable (columns of the contingency table) |
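As an illustration of how the two inputs become a contingency table, the sketch below cross-tabulates two equal-length sequences of category labels using only the standard library. The function name `contingency_table` and the sample data are hypothetical.

```python
from collections import Counter


def contingency_table(var1, var2):
    """Cross-tabulate two equal-length sequences of category labels.

    Rows follow the sorted levels of var1, columns the sorted levels of var2.
    """
    if len(var1) != len(var2):
        raise ValueError("both variables need exactly one value per observation")
    counts = Counter(zip(var1, var2))
    row_labels = sorted(set(var1))
    col_labels = sorted(set(var2))
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


# Illustrative data: one treatment and one outcome per subject.
treatment = ["drug A", "drug A", "drug B", "placebo", "placebo", "drug B"]
outcome = ["improved", "unchanged", "improved", "worsened", "unchanged", "improved"]

rows, cols, table = contingency_table(treatment, outcome)
for label, counts_row in zip(rows, table):
    print(label, counts_row)
```

Each subject appears in exactly one cell, which matches the independence assumption listed earlier.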
Output Metrics
| Metric | What it means |
|---|---|
| Chi-Square | Pearson chi-square test statistic: sum of (observed - expected)^2 / expected across all cells. |
| DF | Degrees of freedom: (rows - 1) * (columns - 1). |
| Pr > ChiSq | P-value for the chi-square test. |
| Likelihood Ratio Chi-Square | G-test statistic: alternative to Pearson chi-square based on log-likelihood ratios. |
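| Cramer's V | Effect size measure for tables of any size; for 2x2 tables it equals the absolute value of phi. Ranges from 0 (no association) to 1 (perfect association). Common benchmarks: small (0.1), medium (0.3), large (0.5). These apply when the smaller table dimension is 2; the equivalent thresholds are lower for larger tables. |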
| Phi Coefficient | Effect size for 2x2 tables. Equivalent to Pearson correlation for two binary variables. |
| Yates' Correction | Continuity-corrected chi-square for 2x2 tables. It makes the test more conservative, which guards against inflated Type I error in small samples. |
| Observed Frequencies | The actual counts in each cell of the contingency table. |
| Expected Frequencies | The counts expected if the two variables were independent. |
| Standardised Residuals | Standardised difference between observed and expected in each cell. Values beyond +/-2 indicate notable departures from independence. |
Interpretation
- If the p-value (reported as Pr > ChiSq) is less than alpha, reject the null hypothesis of independence: the two variables are significantly associated.
- Cramer's V quantifies the strength of association, with common benchmarks of small (0.1), medium (0.3), and large (0.5) when the smaller table dimension is 2. The chi-square test alone only tells you whether there is an association, not how strong it is.
- Examine standardised residuals to identify which cells contribute most to the significant result. Residuals > +2 or < -2 indicate cells with notably more or fewer observations than expected.
- For 2x2 tables, the phi coefficient is equivalent to the Pearson correlation and provides both direction and magnitude.
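The effect size and residuals described above can be computed directly from the contingency table. This is a minimal sketch under two assumptions: the counts are made up for illustration, and "standardised residual" is taken to mean the Pearson residual, (observed - expected) / sqrt(expected). Some packages report an adjusted variant instead, which also divides by a term based on the row and column proportions.

```python
import math


def expected_counts(table):
    """Expected counts under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected across all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(table, expected)
        for o, e in zip(o_row, e_row)
    )


def cramers_v(table):
    """sqrt(chi-square / (n * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(pearson_chi_square(table) / (n * k))


def pearson_residuals(table):
    """(observed - expected) / sqrt(expected) for each cell."""
    expected = expected_counts(table)
    return [
        [(o - e) / math.sqrt(e) for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(table, expected)
    ]


# Hypothetical counts: rows are drug A, drug B, placebo;
# columns are improved, unchanged, worsened.
observed = [[30, 15, 5], [25, 15, 10], [15, 20, 15]]

print(f"Cramer's V = {cramers_v(observed):.3f}")
for row in pearson_residuals(observed):
    print([round(r, 2) for r in row])
```

Cells with the largest absolute residuals contribute most to the chi-square statistic, since each cell's contribution is the square of its Pearson residual.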
Common Pitfalls
- The chi-square approximation is unreliable when expected frequencies are too small (rule of thumb: no expected count < 5). Note that the rule applies to expected counts, not observed counts. Use Fisher's exact test instead.
- A significant chi-square does not tell you which cells or categories are driving the association. Inspect standardised residuals or perform post-hoc pairwise comparisons.
- Very large samples can produce statistically significant results for trivially small associations. Always report the effect size.
- The test treats ordinal variables as nominal and ignores their ordering, which can cost statistical power. For ordered categories, a trend test (for example, Cochran-Armitage for a 2 x k table) may be more appropriate.
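The sample-size pitfall can be seen numerically: multiplying every cell by the same factor leaves the pattern of association unchanged but inflates the chi-square statistic. The sketch below uses made-up counts and a hypothetical helper, `chi_square_and_v`, to show this.

```python
import math


def chi_square_and_v(table):
    """Return the Pearson chi-square statistic and Cramer's V for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return statistic, math.sqrt(statistic / (n * k))


# Illustrative table and the same table with ten times as many subjects per cell.
base = [[30, 15, 5], [25, 15, 10], [15, 20, 15]]
scaled = [[10 * count for count in row] for row in base]

stat_base, v_base = chi_square_and_v(base)
stat_scaled, v_scaled = chi_square_and_v(scaled)
print(f"base:   chi-square = {stat_base:.2f}, V = {v_base:.3f}")
print(f"scaled: chi-square = {stat_scaled:.2f}, V = {v_scaled:.3f}")
```

The statistic grows in proportion to the sample size while Cramer's V stays the same, which is why the effect size should accompany the p-value.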
How It Works
- Construct the contingency table of observed frequencies.
- Compute expected frequencies for each cell: (row total * column total) / grand total.
- Calculate the chi-square statistic as the sum of (observed - expected)^2 / expected across all cells.
- Compare the statistic to the chi-square distribution with (rows-1)*(columns-1) degrees of freedom to obtain the p-value.
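The steps above can be sketched in a few lines with NumPy and SciPy. The function name `chi_square_independence` and the example counts are illustrative assumptions; `scipy.stats.chi2.sf` supplies the upper-tail probability of the chi-square distribution.

```python
import numpy as np
from scipy.stats import chi2


def chi_square_independence(observed):
    """Pearson chi-square test of independence for a table of observed counts."""
    observed = np.asarray(observed, dtype=float)

    # Expected counts: (row total * column total) / grand total.
    row_totals = observed.sum(axis=1, keepdims=True)  # shape (rows, 1)
    col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, columns)
    n = observed.sum()
    expected = row_totals @ col_totals / n

    # Sum of (observed - expected)^2 / expected across all cells.
    statistic = ((observed - expected) ** 2 / expected).sum()

    # Degrees of freedom and upper-tail p-value.
    df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    p_value = chi2.sf(statistic, df)
    return statistic, df, p_value, expected


# Hypothetical counts: rows are drug A, drug B, placebo;
# columns are improved, unchanged, worsened.
observed = [[30, 15, 5], [25, 15, 10], [15, 20, 15]]
statistic, df, p_value, expected = chi_square_independence(observed)
print(f"chi-square = {statistic:.3f}, df = {df}, p = {p_value:.4f}")
```

In practice `scipy.stats.chi2_contingency` performs the same calculation; note that by default it applies Yates' continuity correction to 2x2 tables, so its result differs from this uncorrected sketch in that case.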
References
- Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50(302), 157-175.
- Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.