Canu: a new PacBio and Nanopore assembler for genomes large and small

Friday, September 16, 2016 — Poster Session IV

12:00 p.m. – 1:30 p.m.
FAES Terrace
NHGRI
COMPBIO-7

Authors

  • S Koren
  • BP Walenz
  • K Berlin
  • AM Phillippy

Abstract

Single-molecule sequencing is now routinely used to assemble complete, high-quality microbial genomes, but these assembly methods have not scaled well to large genomes. To address this problem, we previously introduced the MinHash Alignment Process for assembling noisy single-molecule reads. This has enabled reference-grade assemblies of model organisms, revealed novel heterochromatic sequences, and filled low-complexity gaps. We have built on this work to create a new assembler named Canu, optimized for single-molecule sequencing. Canu is a complete refactoring of the Celera Assembler that shrinks the code base by over 70%, lowers coverage requirements, improves reconstruction of diploid genomes, and assembles repetitive sequences more efficiently. Canu has generated single-chromosome assemblies of E. coli, B. anthracis, B. cereus, and Y. pestis with over 99% identity using Oxford Nanopore sequencing. Using older chemistry, the assembly of the eukaryote S. cerevisiae exceeds an NG50 of 500 kb, and we predict chromosome-scale assembly with recent chemistries. Canu is available at https://github.com/marbl/canu.

Category: Computational Biology