Title
The Juicebox Assembly Tools module facilitates de novo assembly of mammalian genomes with chromosome-length scaffolds for under $1000
Author(s)
Dudchenko, Olga; Shamim, Muhammad S.; Batra, Sanjit; Durand, Neva C.; Musial, Nathaniel T.; Mostofa, Ragib; Pham, Melanie; Glenn St Hilaire, Brian; Yao, Weijie; Stamenova, Elena; Hoeger, Marie; Nyquist, Sarah K.; Korchina, Valeriya; Pletch, Kelcie; Flanagan, Joseph P.; Tomaszewicz, Ania; McAloose, Denise; Pérez Estrada, Cynthia; Novak, Ben J.; Omer, Arina D.; Aiden, Erez Lieberman
Published
2018
Publisher
bioRxiv
Published Version DOI
https://doi.org/10.1101/254797
Abstract
Hi-C contact maps are valuable for genome assembly (Lieberman-Aiden, van Berkum et al. 2009; Burton et al. 2013; Dudchenko et al. 2017). Recently, we developed Juicebox, a system for the visual exploration of Hi-C data (Durand, Robinson et al. 2016), and 3D-DNA, an automated pipeline for using Hi-C data to assemble genomes (Dudchenko et al. 2017). Here, we introduce "Assembly Tools," a new module for Juicebox, which provides a point-and-click interface for using Hi-C heatmaps to identify and correct errors in a genome assembly. Together, 3D-DNA and the Juicebox Assembly Tools greatly reduce the cost of accurately assembling complex eukaryotic genomes. To illustrate, we generated de novo assemblies with chromosome-length scaffolds for three mammals: the wombat, Vombatus ursinus (3.3Gb), the Virginia opossum, Didelphis virginiana (3.3Gb), and the raccoon, Procyon lotor (2.5Gb). The only inputs for each assembly were Illumina reads from a short insert DNA-Seq library (300 million Illumina reads, maximum length 2x150 bases) and an in situ Hi-C library (100 million Illumina reads, maximum read length 2x150 bases), which cost <$1000.






PUB23263