Kaplan-Meier Survival Analysis

Estimates the survival function from time-to-event data, accounting for censored observations, and optionally compares survival curves between groups using the log-rank test.

When to Use

Use this analysis when you have time-to-event data with censoring and want to estimate the probability of surviving beyond a given time point. Typical examples include estimating overall survival in a cancer trial, comparing time to relapse between treatment groups, and analysing time to equipment failure.

Assumptions

  • Censoring is non-informative: censored subjects have the same survival prospects as those who remain in the study at the same time point.
  • Survival probability depends only on time since the origin event, not on calendar time.
  • Events are independent across subjects.

Required Inputs

Input | Type | Notes
Time to Event | Numeric | Survival or follow-up time for each subject
Event Indicator | Binary (0/1) | 1 = event occurred, 0 = censored
Group (optional) | Categorical | Optional grouping variable for comparing survival curves

Output Metrics

Metric | What it means
N | Total number of subjects in each group.
N Events | Number of subjects who experienced the event.
N Censored | Number of subjects who were censored (event not observed).
Median Survival | Smallest time at which the estimated survival probability S(t) falls to 0.50 or below. Not estimable if the curve never reaches 0.50 during follow-up, which typically happens when fewer than 50% of subjects experienced the event.
95% CL Lower (Median) | Lower confidence limit for the median survival time.
95% CL Upper (Median) | Upper confidence limit for the median survival time.
Quartile Estimates | Smallest times at which S(t) falls to 0.75 or below and to 0.25 or below, when estimable.
Log-Rank Chi-Square | Test statistic for comparing survival curves between groups.
Log-Rank DF | Degrees of freedom for the log-rank test (number of groups - 1).
Log-Rank Pr > Chi-Square | P-value for the log-rank test.
Survival Probability Table | Estimated S(t) at each event time, with confidence intervals.
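
As a rough illustration of how the median and quartile estimates relate to the survival probability table, the sketch below reads a quantile off a list of (time, S(t)) pairs. The function name `survival_quantile` and the toy table are assumptions made for this example. It uses the simple "first time S(t) is at or below the threshold" convention; some software instead averages two adjacent event times when S(t) equals the threshold exactly.

```python
def survival_quantile(table, p):
    """Return the smallest event time t with S(t) <= 1 - p.

    table: list of (time, survival_probability) pairs sorted by time.
    p: quantile of the event-time distribution (0.5 for the median).
    Returns None when the curve never falls that far (not estimable).
    """
    threshold = 1.0 - p
    for t, s in table:
        if s <= threshold:
            return t
    return None


# Hypothetical survival table: S(t) after each of three event times.
toy_table = [(1, 0.8), (2, 0.6), (3, 0.3)]

median = survival_quantile(toy_table, 0.50)          # S(t) <= 0.50
lower_quartile = survival_quantile(toy_table, 0.25)  # S(t) <= 0.75
upper_quartile = survival_quantile(toy_table, 0.75)  # S(t) <= 0.25
print(median, lower_quartile, upper_quartile)
```

In this toy table the curve stops at 0.3, so the upper quartile (S(t) at or below 0.25) comes back as None, which mirrors the "when estimable" caveat in the table.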

Interpretation

  • The survival curve shows the estimated probability of surviving beyond each time point. A steep drop indicates many events occurring in a short period.
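  • Median survival is the time at which the estimated survival probability falls to 0.50, so half of the subjects are expected to have experienced the event by then. It is the most commonly reported summary measure.
  • The log-rank test compares entire survival curves between groups rather than survival at a single time point. A significant result (p < alpha) is evidence that the survival distributions differ between groups; it does not indicate how large the difference is.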
  • Confidence intervals for the survival function are typically computed using the log-log transformation, which ensures they stay within [0, 1].
  • Kaplan-Meier is descriptive and does not adjust for confounders. Use Cox regression to adjust for covariates.
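
To make the log-log confidence interval concrete, here is a minimal sketch that assumes the variance comes from Greenwood's formula and that the interval is computed pointwise at each event time. The function name `loglog_interval`, the input layout, and the toy counts are assumptions for this example; a given software package may differ in details.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def loglog_interval(counts, z=Z_95):
    """Pointwise log-log confidence limits for a Kaplan-Meier curve.

    counts: list of (number_at_risk, number_of_events) at the ordered
            event times.
    Returns a list of (S, lower, upper) tuples, one per event time.
    """
    surv = 1.0
    greenwood = 0.0  # running sum of d / (n * (n - d))
    rows = []
    for n, d in counts:
        surv *= (n - d) / n
        if 0.0 < surv < 1.0:
            greenwood += d / (n * (n - d))
            # Standard error of log(-log S(t)).
            se = sqrt(greenwood) / abs(log(surv))
            lower = surv ** exp(z * se)
            upper = surv ** exp(-z * se)
        else:
            # The transformation is undefined at S = 0 or S = 1.
            lower = upper = surv
        rows.append((surv, lower, upper))
    return rows


# Hypothetical counts: 5, 4 and 2 subjects at risk, one event each time.
for s, lo, hi in loglog_interval([(5, 1), (4, 1), (2, 1)]):
    print(round(s, 3), round(lo, 3), round(hi, 3))
```

Because the limits are obtained by raising S(t) to a positive power, they cannot leave the [0, 1] range, which is the property the bullet above refers to.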

Common Pitfalls

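  • If censoring is informative, the Kaplan-Meier estimate is biased. For example, if sicker patients drop out, the subjects who remain are healthier than those who were censored, and the estimate is biased upward (survival is overestimated).
  • Median survival cannot be estimated if the survival curve never falls to 0.50, which typically happens when fewer than half the subjects have experienced the event. Extending follow-up may make it estimable; otherwise, report survival probabilities at fixed time points or a lower quantile.
  • The log-rank test has optimal power when hazards are proportional. If survival curves cross, early and late differences can cancel out, and the log-rank test may fail to detect a real difference.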
  • Late censoring can leave very few subjects at risk, producing unreliable estimates in the tail of the curve. Report the number at risk alongside the survival curve.
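
A number-at-risk row for a survival plot can be produced with a simple count. The sketch below is illustrative only: the function name `number_at_risk` and the toy follow-up times are assumptions, and it counts a subject as at risk at time t if their follow-up time is at least t.

```python
def number_at_risk(times, at_times):
    """Count subjects still under observation at each display time.

    times: follow-up time for every subject (event or censoring time).
    at_times: the time points shown under the survival curve.
    """
    return {t: sum(1 for x in times if x >= t) for t in at_times}


# Hypothetical follow-up times for five subjects.
follow_up = [1, 2, 2, 3, 4]
print(number_at_risk(follow_up, [0, 2, 4, 5]))
```

Small counts at the later display times are a signal that the tail of the curve rests on very few subjects.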

How It Works

  1. At each event time, calculate the conditional probability of surviving past that time: (number at risk - number of events) / number at risk.
  2. Multiply these conditional probabilities cumulatively to get S(t), the survival function.
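  3. Censored observations count in the number at risk up to their censoring time and are removed from it afterwards; they never count as events.
  4. For the log-rank test: at each event time, compute the expected number of events in each group under the null hypothesis of equal survival. Sum the (observed - expected) differences over all event times, standardise the result by its variance (for two groups, square the sum and divide by the variance), and compare it to a chi-square distribution with (number of groups - 1) degrees of freedom.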
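
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `kaplan_meier` and `logrank_two_groups` and the toy data are invented for this example, the log-rank function handles exactly two groups, and subjects censored at an event time are treated as still at risk at that time.

```python
from collections import Counter
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(event_time, n_at_risk, n_events, S(t))] for one group."""
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    surv = 1.0
    table = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= (at_risk - d) / at_risk  # steps 1 and 2
            table.append((t, at_risk, d, surv))
        at_risk -= exits[t]  # step 3: leave the risk set after time t
    return table


def logrank_two_groups(times, events, groups):
    """Two-group log-rank test; returns (chi_square, p_value), 1 df."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = labels[0]
    rows = list(zip(times, events, groups))
    event_times = sorted({x for x, e, _ in rows if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(x, e, g) for x, e, g in rows if x >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == first)
        d = sum(1 for x, e, _ in at_risk if x == t and e == 1)
        d1 = sum(1 for x, e, g in at_risk
                 if x == t and e == 1 and g == first)
        obs_minus_exp += d1 - d * n1 / n  # observed - expected, group 1
        if n > 1:
            # Hypergeometric variance of d1 at this event time.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("variance is zero; the test is undefined")
    chi_square = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi_square / 2))  # upper tail of chi-square, 1 df
    return chi_square, p_value


# Hypothetical data: 1 = event, 0 = censored.
print(kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
print(logrank_two_groups([1, 2, 3, 4, 5, 6], [1] * 6,
                         ["A", "A", "A", "B", "B", "B"]))
```

The p-value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x / 2)), which avoids a dependency on a statistics library. Comparing more than two groups needs a covariance matrix rather than a single variance and is not covered by this sketch.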

References

  • Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457-481.
  • Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports, 50(3), 163-170.