Fisher's Exact Test

Categorical Analysis

An exact test of association for 2x2 contingency tables that computes the p-value directly from the hypergeometric distribution, without relying on large-sample approximations.

When to Use

Use this test for 2x2 contingency tables, especially when expected cell counts are small (< 5) and the chi-square approximation is unreliable. For example, testing whether a rare adverse event is associated with treatment group in a clinical trial with few events.

Assumptions

  • The contingency table is 2x2 (two binary variables).
  • Observations are independent.
  • The marginal totals (row and column sums) are fixed by the study design, or treated as fixed for the purpose of the test.

Required Inputs

Input        Type                     Notes
Variable 1   Categorical (2 levels)   First binary variable
Variable 2   Categorical (2 levels)   Second binary variable

Output Metrics

Metric                 What it means
Odds Ratio             The ratio of odds: (a*d) / (b*c) for a 2x2 table with cells a, b, c, d. An OR of 1 indicates no association.
p-value (two-tailed)   Exact two-tailed p-value computed from the hypergeometric distribution.
Conditional p-value    One-tailed exact p-value, if directional testing is appropriate.
OR 95% CI Lower        Lower bound of the 95% confidence interval for the odds ratio.
OR 95% CI Upper        Upper bound of the 95% confidence interval for the odds ratio.
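
As a minimal sketch of the odds ratio formula above, the function below computes (a*d) / (b*c) for a table laid out as [[a, b], [c, d]]. The function name `odds_ratio` and the handling of zero cells are illustrative assumptions, not the behavior of any particular tool. Some software reports a different estimate of the odds ratio, so values may not match exactly.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d) / (b*c) for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        # Assumed convention: infinite if the numerator is positive,
        # undefined (NaN) if numerator and denominator are both zero.
        return float("inf") if a * d > 0 else float("nan")
    return (a * d) / (b * c)


# Hypothetical table: 3 and 1 in the first row, 1 and 3 in the second.
print(odds_ratio(3, 1, 1, 3))  # 9.0
```

A zero in cell b or c makes the sample odds ratio infinite or undefined, which is common in the small tables where this test is typically used.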

Interpretation

  • If the p-value is less than alpha, there is a statistically significant association between the two binary variables.
  • An odds ratio > 1 indicates that the first category of Variable 1 is associated with higher odds of the first category of Variable 2 (and vice versa for OR < 1).
  • The confidence interval for the odds ratio gives a range of plausible values. If the 95% interval includes 1, the association is not statistically significant at the 0.05 level.
  • Fisher's exact test is valid for any sample size, but it is most commonly used when samples are small.

Common Pitfalls

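  • Fisher's exact test is computationally intensive for tables with large counts, because many possible tables must be evaluated. For large samples with adequate expected cell counts, the chi-square test gives a close approximation and is faster.
  • The odds ratio from a 2x2 table can be misleading if the outcome is common (> 10%), because the odds ratio then lies farther from 1 than the relative risk and overstates the association if read as a relative risk.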
  • The two-tailed p-value is computed by summing probabilities of all tables as extreme or more extreme than observed. Different software may use slightly different definitions of "extreme".
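
To illustrate the gap between the odds ratio and the relative risk, here is a small sketch with two hypothetical tables that share the same relative risk: one with a common outcome and one with a rare outcome. The helper names and the counts are made up for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for [[a, b], [c, d]]: rows are groups, columns are event / no event."""
    return (a * d) / (b * c)


def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Risk in row 1 divided by risk in row 2: [a/(a+b)] / [c/(c+d)]."""
    return (a * (c + d)) / (c * (a + b))


# Hypothetical common outcome: 60% vs 30% event rates.
common = (60, 40, 30, 70)
# Hypothetical rare outcome: 6% vs 3% event rates.
rare = (6, 94, 3, 97)

print(round(relative_risk(*common), 2), round(odds_ratio(*common), 2))  # 2.0 3.5
print(round(relative_risk(*rare), 2), round(odds_ratio(*rare), 2))      # 2.0 2.06
```

Both tables have a relative risk of 2, but the odds ratio is 3.5 when the outcome is common and about 2.06 when it is rare.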

How It Works

  1. Given the 2x2 table with fixed marginal totals, enumerate all possible tables that have the same margins.
  2. Compute the probability of each table using the hypergeometric distribution.
  3. Sum the probabilities of all tables that are as extreme as or more extreme than the observed table to get the p-value.
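
The three steps above can be sketched in plain Python using only the standard library. This is an illustrative sketch, not the implementation used by any particular tool. It assumes that "as extreme or more extreme" means "a table whose probability is no larger than that of the observed table", which is one common definition; the function names and the tie tolerance are hypothetical.

```python
from math import comb


def table_probability(x: int, row1: int, row2: int, col1: int) -> float:
    """Hypergeometric probability of the table whose top-left cell is x."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    # Step 1: with margins fixed, the top-left cell determines the whole
    # table, so enumerating its feasible values enumerates all tables.
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Step 2: probability of each table under the hypergeometric distribution.
    probs = {x: table_probability(x, row1, row2, col1)
             for x in range(lowest, highest + 1)}
    # Step 3: sum the probabilities of tables no more likely than the
    # observed one. The small relative tolerance (an assumption) guards
    # against floating-point ties being missed.
    cutoff = probs[a] * (1 + 1e-7)
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


def fisher_exact_upper_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided p-value: probability of a top-left cell of at least a."""
    row1, row2, col1 = a + b, c + d, a + c
    highest = min(row1, col1)
    return sum(table_probability(x, row1, row2, col1)
               for x in range(a, highest + 1))


# Hypothetical table with all margins equal to 4.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))   # 0.4857
print(round(fisher_exact_upper_tail(3, 1, 1, 3), 4))  # 0.2429
```

For this table the five possible tables have probabilities 1/70, 16/70, 36/70, 16/70, and 1/70. The observed table has probability 16/70, so the two-sided p-value is (1 + 16 + 16 + 1) / 70 = 34/70.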

References

  • Fisher, R. A. (1922). On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1), 87-94.