Interpreting the Results Table

Understand what each column in the differential expression results table means and how to identify statistically significant, biologically meaningful genes.

When to Use

Your analysis has completed and you are reviewing the results in the Results tab.
You need to filter genes by significance or fold-change thresholds, or export a gene list for downstream analysis.
You want to understand the meaning of each column before drawing biological conclusions.

Required Inputs

A completed RNA-seq run with results available in the Results tab.

What to Expect

baseMean: the average normalised count across all samples. Higher values indicate more abundant transcripts.
log2FoldChange: the effect size on the log2 scale. A value of 1 means the gene is expressed twice as highly in the test group as in the reference group; -1 means half as highly.
lfcSE: the standard error of the log2 fold-change estimate. Smaller values indicate more precise estimates.
pvalue: the raw p-value from the Wald test for the null hypothesis that the true fold change is zero.
padj: the Benjamini-Hochberg adjusted p-value that controls the false discovery rate. This is the column you should use for significance calls.
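
To make the relationship between pvalue and padj concrete, here is a minimal Python sketch of the Benjamini-Hochberg adjustment. The function name bh_adjust and the five p-values are made up for illustration. The sketch adjusts over every value it is given, whereas the analysis adjusts only over genes that pass independent filtering, so it is not expected to reproduce the table's padj column exactly.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.001, 0.008, 0.039, 0.041, 0.60]  # made-up p-values for five genes
padj = bh_adjust(raw)
for p, q in zip(raw, padj):
    print(f"pvalue={p:.3f}  padj={q:.4f}")
```

In this toy input the third and fourth genes have raw p-values below 0.05 but adjusted values of about 0.051, so they would not be called significant at the default threshold.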

Interpretation

Use padj < 0.05 as the default significance threshold. Among the genes that pass this cutoff, the expected proportion of false positives is at most 5%.
Combine significance with a fold-change threshold (e.g., |log2FC| > 1) to focus on genes with both statistical and biological significance.
Genes with large fold changes but high lfcSE are imprecisely estimated -- these are often low-expression genes with noisy count data.
The volcano plot visualises -log10(padj) against log2FoldChange, placing the most significant and largest-effect genes in the upper corners.
The MA plot shows log2FoldChange against baseMean, revealing whether fold-change estimates are consistent across expression levels.
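
The combined significance and fold-change filter described above can be sketched in Python using only the standard library. The CSV layout, the "gene" column name, the example rows, and the function names are assumptions for illustration; an exported table may use different headers or delimiters.

```python
import csv
import io

# Hypothetical export: a CSV with a "gene" column plus the result columns.
EXAMPLE_CSV = """gene,baseMean,log2FoldChange,lfcSE,pvalue,padj
geneA,1520.4,2.31,0.21,1.2e-27,3.1e-24
geneB,890.2,-1.45,0.30,1.3e-06,4.0e-05
geneC,4.1,3.80,1.95,0.051,NA
geneD,2300.9,0.40,0.10,6.3e-05,0.0012
geneE,310.7,1.10,0.52,0.034,0.11
"""


def to_float(text):
    """Parse a numeric field, mapping NA or blank to NaN."""
    return float("nan") if text in ("", "NA") else float(text)


def significant_genes(rows, alpha=0.05, lfc_cutoff=1.0):
    """Yield (gene, direction) for rows passing both thresholds."""
    for row in rows:
        padj = to_float(row["padj"])
        lfc = to_float(row["log2FoldChange"])
        # NaN fails every comparison, so NA rows are never called significant.
        if padj < alpha and abs(lfc) > lfc_cutoff:
            yield row["gene"], "up" if lfc > 0 else "down"


rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
hits = list(significant_genes(rows))
print(hits)  # prints [('geneA', 'up'), ('geneB', 'down')]
```

Here geneC is dropped because its padj is NA, geneD because its fold change is small despite a low padj, and geneE because its padj exceeds the threshold.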

Common Pitfalls

Use padj (not pvalue) to determine significance. Raw p-values are not corrected for the thousands of simultaneous tests and will produce many false positives.
Genes with very low baseMean may show dramatic fold changes that are unreliable because they are driven by small count differences.
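NA values in padj usually indicate genes removed by independent filtering because their mean normalised count is too low to give a useful test. In DESeq2-style output, padj is also NA whenever pvalue itself is NA, which happens for genes with zero counts in every sample or with an extreme count outlier. These are not errors.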
Log2 fold-change shrinkage (e.g., apeglm) changes the magnitude of log2FC values but does not alter padj. Shrinkage improves gene ranking for downstream analyses like GSEA.
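
The first pitfall can be illustrated with a small simulation. This Python sketch assumes a hypothetical experiment of 20,000 tested genes in which no gene is truly differentially expressed, so every raw p-value is a uniform random draw; the gene count and the seed are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

n_genes = 20_000  # assumed number of tested genes, none truly changed
alpha = 0.05

# Under the null hypothesis, p-values are uniformly distributed on [0, 1].
null_pvalues = [random.random() for _ in range(n_genes)]

false_calls = sum(p < alpha for p in null_pvalues)
print(f"expected false positives: {n_genes * alpha:.0f}")
print(f"raw p < {alpha}: {false_calls} genes called, all of them false")
```

Roughly a thousand genes pass the raw cutoff even though nothing has changed, which is why significance calls should be based on padj.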

References

Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B, 57(1), 289-300.
Zhu, A., Ibrahim, J. G., & Love, M. I. (2019). Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics, 35(12), 2084-2092. doi:10.1093/bioinformatics/bty895.