Kaplan-Meier Survival Analysis
Estimates the survival function from time-to-event data, accounting for censored observations, and optionally compares survival curves between groups using the log-rank test.
When to Use
Use this analysis when you have time-to-event data with censoring and want to estimate the probability of surviving beyond a given time point. Examples include estimating overall survival in a cancer trial, comparing time to relapse between treatment groups, and analysing time to equipment failure.
Assumptions
- Censoring is non-informative: censored subjects have the same survival prospects as those who remain in the study at the same time point.
- Survival probability depends only on time since the origin event, not on calendar time.
- Events are independent across subjects.
Required Inputs
| Input | Type | Notes |
|---|---|---|
| Time to Event | Numeric | Survival or follow-up time for each subject |
| Event Indicator | Binary (0/1) | 1 = event occurred, 0 = censored |
| Group (optional) | Categorical | Optional grouping variable for comparing survival curves |
Output Metrics
| Metric | What it means |
|---|---|
| N | Total number of subjects in each group. |
| N Events | Number of subjects who experienced the event. |
| N Censored | Number of subjects who were censored (event not observed). |
| Median Survival | Earliest time at which the estimated survival probability falls to 0.50 or below. Not estimable if the survival curve never drops to 0.50 during follow-up. |
| 95% CL Lower (Median) | Lower confidence limit for the median survival time. |
| 95% CL Upper (Median) | Upper confidence limit for the median survival time. |
| Quartile Estimates | Survival times at which S(t) crosses 0.75 and 0.25, when estimable. |
| Log-Rank Chi-Square | Test statistic for comparing survival curves between groups. |
| Log-Rank DF | Degrees of freedom for the log-rank test (number of groups - 1). |
| Log-Rank Pr > Chi-Square | P-value for the log-rank test. |
| Survival Probability Table | Estimated S(t) at each event time, with confidence intervals. |
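
The median and quartile rows can be read directly off the survival probability table. The sketch below illustrates one common convention: the quantile is the earliest event time at which S(t) is at or below the target level. The function name `survival_quantile` and the `(time, survival)` table layout are assumptions made for this example, and statistical packages may handle ties at exactly the target level differently (for instance by averaging adjacent event times).

```python
def survival_quantile(km_table, p):
    """Return the earliest time at which S(t) <= 1 - p, or None if not estimable.

    km_table: list of (time, survival_probability) pairs sorted by time
              (hypothetical layout chosen for this example).
    p:        fraction of subjects that have experienced the event,
              e.g. 0.5 for the median, 0.25 and 0.75 for the quartiles.
    """
    threshold = 1 - p
    for time, surv in km_table:
        if surv <= threshold:
            return time
    return None  # the curve never falls to the threshold during follow-up


# Illustrative table: S(t) drops to 0.80, 0.53 and 0.27 at times 1, 3 and 4.
example_table = [(1, 0.8), (3, 0.5333), (4, 0.2667)]
median = survival_quantile(example_table, 0.50)
lower_quartile = survival_quantile(example_table, 0.25)
upper_quartile = survival_quantile(example_table, 0.75)
```

In this illustrative table the median is 4 and the 25% quantile is 3, while the 75% quantile is not estimable because the curve never falls to 0.25.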
Interpretation
- The survival curve shows the estimated probability of surviving beyond each time point. A steep drop indicates many events occurring in a short period.
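- Median survival is the time at which the estimated survival probability falls to 0.50, that is, the time by which half of the subjects are expected to have experienced the event. It is the most commonly reported summary measure.
- The log-rank test compares entire survival curves between groups. A significant result (p < alpha) is evidence that survival differs between at least two of the groups; it does not indicate the size or timing of the difference.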
- Confidence intervals for the survival function are typically computed using the log-log transformation, which ensures they stay within [0, 1].
- Kaplan-Meier is descriptive and does not adjust for confounders. Use Cox regression to adjust for covariates.
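
As an illustration of the log-log confidence interval mentioned above, the following sketch combines Greenwood's variance formula with the log(-log) transformation. It assumes the input is a list of `(number at risk, number of events)` pairs at successive event times; the function name `loglog_ci` and this input layout are hypothetical, and the code is a minimal sketch rather than the exact procedure used to produce the output table.

```python
from math import exp, log, sqrt

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def loglog_ci(risk_table, z=Z_95):
    """Return a list of (S, lower, upper), one entry per event time.

    risk_table: list of (n_at_risk, n_events) at successive event times
                (hypothetical layout chosen for this example).
    """
    results = []
    surv = 1.0
    greenwood = 0.0  # running sum of d / (n * (n - d))
    for n, d in risk_table:
        surv *= (n - d) / n
        if surv == 0.0:
            # The transformation is undefined once S(t) reaches 0.
            results.append((0.0, None, None))
            continue
        greenwood += d / (n * (n - d))
        # Standard error of log(-log S(t)) via the delta method.
        se = sqrt(greenwood) / abs(log(surv))
        lower = surv ** exp(z * se)
        upper = surv ** exp(-z * se)
        results.append((surv, lower, upper))
    return results


# Five subjects; events occur when 5, 3 and 2 subjects are at risk.
intervals = loglog_ci([(5, 1), (3, 1), (2, 1)])
```

Because the limits are obtained by raising S(t) to a positive power, they always fall between 0 and 1, which is the property the transformation is chosen for.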
Common Pitfalls
- If censoring is informative, the Kaplan-Meier estimate is biased. For example, if sicker patients drop out, survival is overestimated.
- Median survival cannot be estimated if the survival curve never falls to 0.50, which typically happens when too few events have been observed. Longer follow-up is needed to estimate it; until then, report the survival probability at a fixed time point instead.
- The log-rank test is most powerful when hazards are proportional. If the survival curves cross, it may fail to detect a real difference.
- Late censoring can leave very few subjects at risk, producing unreliable estimates in the tail of the curve. Report the number at risk alongside the survival curve.
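
A number-at-risk table for a set of reporting times can be produced with a few lines of code. This sketch assumes a subject counts as at risk at time t when their observed time is greater than or equal to t; the function name `number_at_risk` and the choice of reporting times are illustrative only.

```python
def number_at_risk(times, report_times):
    """Count subjects still under observation at each reporting time.

    times:        observed time (event or censoring) for each subject.
    report_times: time points at which to report the number at risk.
    """
    return {t: sum(1 for u in times if u >= t) for t in report_times}


# Five subjects with observed times 1..5, reported at times 0, 2, 4 and 6.
at_risk = number_at_risk([1, 2, 3, 4, 5], [0, 2, 4, 6])
```

Small counts at the later reporting times signal that the tail of the curve rests on very few subjects.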
How It Works
- At each event time, calculate the conditional probability of surviving past that time: (number at risk - number of events) / number at risk.
- Multiply these conditional probabilities cumulatively to get S(t), the survival function.
- Censored observations reduce the number at risk but do not count as events.
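- For the log-rank test: at each event time, compute the expected number of events in each group under the null hypothesis of equal survival. Sum the (observed - expected) differences across event times, standardise the sum by its variance, and compare the resulting statistic to a chi-square distribution with (number of groups - 1) degrees of freedom.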
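
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not necessarily how this analysis is implemented: the function names `kaplan_meier` and `logrank_two_groups` are hypothetical, subjects censored at time t are treated as still at risk at t, and the log-rank version shown handles only two groups, using the hypergeometric variance at each event time.

```python
from collections import Counter
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, n_at_risk, n_events, S(t))] for each distinct event time.

    times:  observed time for each subject.
    events: 1 if the event occurred at that time, 0 if censored.
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    table = []
    surv = 1.0
    for t in sorted(event_counts):
        # Subjects with an observed time >= t are at risk at t.
        n_at_risk = sum(1 for u in times if u >= t)
        d = event_counts[t]
        surv *= (n_at_risk - d) / n_at_risk  # conditional survival past t
        table.append((t, n_at_risk, d, surv))
    return table


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-group log-rank test (1 df)."""
    all_pairs = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in all_pairs if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        n = n_a + n_b
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e == 1)
        d = d_a + d_b
        # Expected events in group A if both groups share one survival curve.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2))
    return chi_square, p_value
```

For example, `kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])` yields event times 1, 3 and 4 with 5, 3 and 2 subjects at risk, and survival estimates of 0.80, about 0.53 and about 0.27. The subject censored at time 2 leaves the risk set without producing a step in the curve.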
Citations
References
- Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53(282), 457-481.
- Mantel, N. (1966). Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports, 50(3), 163-170.