RNA-seq read estimator
Gives a rough planning estimate of the reads needed per RNA-seq sample, based on the number of features and a target average number of reads per feature. It is not a power calculation.
How it works
Formula
reads ≈ (features × target reads per feature) ÷ usable fraction. The usable fraction is the share of raw reads left after losses to rRNA, multi-mapping and QC (for example, 0.8 if 20% of reads are lost).
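As a minimal sketch, the formula could be written in Python as follows. The function name `estimate_reads` and its parameter names are illustrative assumptions, not the tool's actual code.

```python
# Hypothetical helper: names are illustrative, not taken from the tool itself.
def estimate_reads(features: int, reads_per_feature: float, usable_fraction: float) -> int:
    """Rough reads-per-sample planning estimate (not a power calculation)."""
    if features <= 0 or reads_per_feature <= 0:
        raise ValueError("features and reads_per_feature must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    # reads ≈ (features × target reads per feature) ÷ usable fraction
    return round(features * reads_per_feature / usable_fraction)


# Human defaults: ~20,000 genes, 100 reads per gene, 80% usable.
print(estimate_reads(20_000, 100, 0.8))
```

The input checks are a design choice for the sketch: a usable fraction of 0 would divide by zero, and a value above 1 would imply more usable reads than were sequenced.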
Worked example
Human (~20,000 genes), 100 reads per gene on average, 80% usable: (20,000 × 100) ÷ 0.8 = 2,500,000 reads per sample. (For differential expression many labs target 20–30M reads/sample, which at 80% usable works out to roughly 800–1,200 usable reads per gene on average.)
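To compare a planned sequencing depth against the target, the same formula can be inverted to give the average usable reads per gene that a depth implies. The sketch below assumes the human defaults (~20,000 genes, 80% usable). The function name `implied_reads_per_feature` is hypothetical.

```python
# Hypothetical helper: inverts reads ≈ (features × reads per feature) ÷ usable fraction.
def implied_reads_per_feature(total_reads: float, features: int, usable_fraction: float) -> float:
    """Average usable reads per feature implied by a given sequencing depth."""
    if features <= 0 or total_reads < 0:
        raise ValueError("features must be positive and total_reads non-negative")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    return total_reads * usable_fraction / features


# Depths to compare: the worked example, then the 20-30M range many labs target.
for depth in (2_500_000, 20_000_000, 30_000_000):
    per_gene = implied_reads_per_feature(depth, features=20_000, usable_fraction=0.8)
    print(f"{depth:>12,} reads/sample -> ~{per_gene:,.0f} usable reads per gene on average")
```

These are averages only. Because expression is highly skewed, many genes will receive far fewer reads than the figure printed.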
When to use it
Use it for a quick, order-of-magnitude planning figure when budgeting an RNA-seq run. It is NOT a statistical power calculation, and it ignores the highly skewed real expression distribution: low-expression genes need far more depth than the average implies.
Sensible defaults
Defaults assume the human protein-coding gene set (~20,000), 100 reads/gene and 80% usable reads. The preset gene counts are approximate round figures; set your own with the custom option.
FAQ
- Is this a replacement for a power analysis?
- No. It is a rough planning heuristic. For detecting differential expression at a chosen effect size and FDR, use a dedicated RNA-seq power tool that models dispersion and the expression distribution.
- Where do the preset gene counts come from?
- They are approximate protein-coding gene counts (~20,000 human, ~22,000 mouse). Use the custom field for a specific annotation, exome, or panel.