
Single-molecule sequencing and assembly of human rDNA gene clusters

Wednesday, September 14, 2016 — Poster Session I

3:00 p.m. – 4:30 p.m.
FAES Terrace


  • A Dilthey
  • S Koren
  • J-H Kim
  • R Nagaraja
  • D Schlessinger
  • V Larionov
  • A Phillippy


The ribosome plays a central role in protein biosynthesis. In humans, three of the four RNA components of the cytoplasmic ribosome (28S, 5.8S, 18S) are encoded as a single transcription unit (45S). Hundreds of these ribosomal DNA (rDNA) genes are arranged in tandem across the short arms of the acrocentric autosomes. Beyond the canonical repeat structure of ~13 kb 45S copies separated by a ~34 kb spacer, little is known about intra-genome and population polymorphism because the rDNA regions are too repetitive to be sequenced and assembled with standard methods. As a result, GenBank currently contains only a few human rDNA sequences, and the corresponding regions in the reference genome are labeled as “undefined.” We show that these challenges can be overcome by transformation-associated recombination (TAR) cloning, followed by high-coverage, single-molecule sequencing and assembly of the resulting rDNA BACs. In a pilot study, we applied our approach to six BACs, including a reference rDNA BAC (CH507-528H12), which we were able to reconstruct de novo with >99% accuracy. Of the five additional BACs isolated via TAR cloning, three were structurally normal, one contained an inversion, and one carried three copies of the 18S but no 5.8S copies. In a multiple sequence alignment of the assembled ribosomal subunit sequences, we see no evidence of polymorphism at any 5.8S position, but we do see evidence of variation at multiple positions of the 28S and 18S. Detailed characterization and validation of these variants, application to additional BACs, and development of whole-genome methods are in progress.

Category: Genetics and Genomics