
Identifying differential expression in count data from high-throughput sequencing

Monday, October 24, 2011 — Poster Session I

Noon – 2:00 p.m.

Natcher Conference Center




  • D Russ
  • S Glanowski
  • C Johnson


Commonly used statistical tests for differential expression in count data from high-throughput sequencing are usually based on Fisher’s exact test, which may yield unrealistically low p-values. DESeq and edgeR are widely used software packages for analyzing RNA sequence counts from high-throughput sequencing. MicroRNA count data from peripheral blood samples of patients with age-related macular degeneration (AMD) and from control samples, provided by the Swaroop lab (NEI), show large variation in counts within both the AMD and control groups; nevertheless, the p-values that DESeq and edgeR report for differential expression are well below 1%. We propose a different test of differential expression based on the confidence interval at a set confidence level. The confidence interval test is more conservative. We present results showing that the p-values from DESeq and edgeR are overly optimistic, and that the confidence interval test gives a more realistic estimate of the p-value.
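
The abstract does not say how the confidence interval is constructed, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes a log2(count + 1) transform, a normal-approximation interval for each group mean, and a rule that calls a microRNA differentially expressed only when the two group intervals do not overlap. The function names `mean_ci` and `ci_differs` and the sample counts are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`.

    Assumption: a normal quantile is used for simplicity; with few samples
    per group a Student t quantile would give a wider interval.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples per group")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    center = mean(values)
    half_width = z * stdev(values) / math.sqrt(n)
    return center - half_width, center + half_width


def ci_differs(case_counts, control_counts, level=0.95):
    """Return True when the two group intervals do not overlap.

    Assumption: counts are compared on a log2(count + 1) scale so that
    a few very large counts do not dominate the group mean.
    """
    case = [math.log2(c + 1) for c in case_counts]
    control = [math.log2(c + 1) for c in control_counts]
    case_lo, case_hi = mean_ci(case, level)
    ctrl_lo, ctrl_hi = mean_ci(control, level)
    return case_hi < ctrl_lo or ctrl_hi < case_lo


if __name__ == "__main__":
    # Made-up counts: tight groups with a clear shift, then noisy groups.
    print(ci_differs([100, 110, 90, 105], [1000, 1100, 900, 1050]))
    print(ci_differs([10, 1000, 50, 400], [20, 800, 100, 300]))
```

Under these assumptions, large within-group variation widens both intervals, so the rule flags fewer microRNAs than a test that reports a very small p-value, which matches the more conservative behavior described above.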
