Assess and minimize false SNVs from RNA-Seq analysis

Monday, October 24, 2011 — Poster Session I

Noon – 2:00 p.m.

Natcher Conference Center

CIT

BIOINFO-19

Authors

  • W Xiao
  • X Liu
  • R Schmitz
  • S Jhavar
  • G Wright
  • J Powell
  • L Staudt

Abstract

RNA-Seq with next-generation sequencing technology is widely used to determine the level and form of gene expression. It can also be used to discover somatic mutations in a sample of interest. We previously reported the establishment of an RNA-Seq pipeline to identify single-nucleotide variations (SNVs), fusion transcripts, and digital expression. Here we report our attempt to further fine-tune SNV detection in an effort to minimize the rate of false discovery due to possible mis-mapping of reads. We applied deep genome sequencing using Complete Genomics (CG) technology and RNA-Seq using Illumina GAIIx technology in parallel to two samples. Using the final assembly results from CG as the gold standard, we were able to assess the rate of false SNVs in each sample. By applying two parallel processes that used different alignment software and reference sequence data sets, we decreased the rate of false SNV calls from 25-40% to 5%. With this modified process, we have performed SNV discovery in nearly 200 samples. On average, more than 92% of the SNVs discovered in each sample were SNPs.
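The assessment described above can be illustrated with a minimal sketch. It assumes that each pipeline's SNV calls and the gold-standard calls are available as sets of (chromosome, position, reference base, alternate base) records, and that the two parallel processes are combined by keeping only the calls they agree on. The abstract does not state how the two processes were combined, so the intersection step, the names `SNV`, `false_call_rate`, and `consensus_calls`, and the toy data are all illustrative assumptions rather than the authors' implementation.

```python
from typing import NamedTuple, Set


class SNV(NamedTuple):
    """One single-nucleotide variant call (hypothetical record layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def false_call_rate(calls: Set[SNV], gold: Set[SNV]) -> float:
    """Fraction of calls that are absent from the gold-standard set."""
    if not calls:
        return 0.0
    return len(calls - gold) / len(calls)


def consensus_calls(calls_a: Set[SNV], calls_b: Set[SNV]) -> Set[SNV]:
    """Keep only the calls reported by both pipelines (assumed combination rule)."""
    return calls_a & calls_b


# Toy data, invented purely for illustration.
true_1 = SNV("chr1", 1000, "A", "G")
true_2 = SNV("chr2", 2500, "C", "T")
true_3 = SNV("chr7", 4200, "G", "A")
artefact_a = SNV("chr3", 300, "T", "C")       # mis-mapping seen only by pipeline A
artefact_b = SNV("chr5", 900, "G", "C")       # mis-mapping seen only by pipeline B
artefact_shared = SNV("chr9", 120, "A", "T")  # false call made by both pipelines

gold = {true_1, true_2, true_3}
pipeline_a = {true_1, true_2, true_3, artefact_a, artefact_shared}
pipeline_b = {true_1, true_2, true_3, artefact_b, artefact_shared}

rate_a = false_call_rate(pipeline_a, gold)
rate_b = false_call_rate(pipeline_b, gold)
consensus = consensus_calls(pipeline_a, pipeline_b)
rate_consensus = false_call_rate(consensus, gold)

print(f"pipeline A: {rate_a:.2f}, pipeline B: {rate_b:.2f}, "
      f"consensus: {rate_consensus:.2f}")
```

In this toy case, false calls specific to one aligner or reference set drop out of the consensus, while a false call shared by both remains, so the rate falls without reaching zero. A real comparison would also need to restrict both call sets to positions that the gold-standard assembly actually covers, which this sketch does not attempt.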
