How much disk space do I need for my templated genome assembly?

November 24, 2015 Next-Gen Sequencing

We previously described the RAM requirements for de novo genome assemblies, which scale linearly with genome size. Extrapolating to large eukaryotic genomes would suggest RAM requirements measured in terabytes, but de novo assembly is often not the goal. Instead, you may want to align large numbers of reads against a known reference genome to detect variation relative to that reference. For such applications, Lasergene Genomics Suite and SeqMan NGen use DNASTAR’s patented disk sort alignment (DSA) algorithm, which greatly reduces RAM requirements. With DSA, the software writes the temporary files needed during assembly to disk. Drive speed affects assembly time, so for maximum efficiency we recommend using one drive for the input data and result files and a separate drive for the temporary files.
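
As a quick pre-flight check for the two-drive recommendation, the following minimal Python sketch reports free space for a data folder and a temporary folder and flags when both sit on the same file-system device. The function names and the folder paths are hypothetical placeholders, not part of SeqMan NGen, and the device comparison is only a heuristic.

```python
import os
import shutil


def free_space_gb(path):
    """Free space, in GB (10**9 bytes), on the file system containing path."""
    return shutil.disk_usage(path).free / 1e9


def on_separate_devices(path_a, path_b):
    """True if the two paths are on different file-system devices.

    Heuristic only: two partitions of one physical drive also report
    different device IDs, so this cannot prove separate hardware.
    """
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations; replace with your own data and scratch folders.
    data_dir = "."
    temp_dir = "."
    print(f"Data folder free space: {free_space_gb(data_dir):.1f} GB")
    print(f"Temp folder free space: {free_space_gb(temp_dir):.1f} GB")
    if not on_separate_devices(data_dir, temp_dir):
        print("Data and temporary folders share a device.")
```

Pointing `data_dir` and `temp_dir` at the folders you plan to use shows at a glance whether the temporary files would compete with the input and result files for the same device.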

The question, then, is how much disk space a templated assembly needs. Many of the factors that affect RAM usage for de novo assembly also influence the disk space requirements for a templated assembly: genome size and complexity, the number of reads, read length, and read accuracy. The choice of an appropriate reference sequence is also critical, both because of the potential for misalignment and to avoid eliminating critical sequences that cannot be aligned with confidence.

Templated Assembly Disk Requirements

We collected sufficient Illumina 2×100 paired-end data sets from the Sequence Read Archive (SRA) to provide approximately 40x coverage for a range of genome sizes, from E. coli to H. sapiens. The data were assembled against the corresponding reference genomes using SeqMan NGen while we monitored the disk space used during the process. The measured disk space excludes the input data (reads and reference genome) but includes both temporary files and final result files. For non-microbial organisms, the graph suggests a rule of thumb of about 0.5 – 0.7 GB of disk space per Mb of genome length. As shown in the graph, a human genome can be aligned against a reference genome using SeqMan NGen on a computer with as little as 2 terabytes of hard disk space available.
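
The rule of thumb above can be turned into a small estimator. The sketch below is illustrative only: the function name `estimate_disk_gb` is hypothetical, the human genome size of roughly 3,100 Mb is an assumed round figure rather than a value from this post, and the 0.5 to 0.7 GB per Mb range was measured at about 40x coverage with Illumina 2×100 paired-end reads, so other data sets may differ.

```python
# Rule-of-thumb bounds from the post: about 0.5 to 0.7 GB of disk per Mb of genome
# (non-microbial organisms, ~40x coverage, Illumina 2x100 paired-end reads).
GB_PER_MB_LOW = 0.5
GB_PER_MB_HIGH = 0.7


def estimate_disk_gb(genome_size_mb):
    """Return (low, high) estimated disk usage in GB for a templated assembly.

    Covers temporary and result files only, not the input reads or reference.
    """
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return genome_size_mb * GB_PER_MB_LOW, genome_size_mb * GB_PER_MB_HIGH


if __name__ == "__main__":
    # Assumed genome size in Mb; an approximation, not a figure from the post.
    human_genome_mb = 3100
    low, high = estimate_disk_gb(human_genome_mb)
    print(f"H. sapiens: {low:,.0f} to {high:,.0f} GB")
```

Under these assumptions the estimate for a human genome comes out at roughly 1.5 to 2.2 TB, which is in the same range as the 2 terabyte figure quoted above.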

Learn more about reference-guided genome assembly in Lasergene and see benchmarks for various assembly times here.

4 Comments

  • Suliman Basit

    December 1, 2015 at 11:46 AM

    I would like to ask
    How much RAM is required to align whole exome data at 100x coverage against the human genome?
    How much time does it take on a system with 32 GB of RAM?

    • Katie Maxfield

      December 1, 2015 at 6:58 PM

      Suliman, thanks for your questions.
      For all reference-guided alignments, we recommend at least 16 GB of RAM. See our Technical Requirements for more info. In general, an exome alignment like you describe should take less than 2 hours to complete – see our Exome Alignment page for the latest benchmarks.

  • Dr A R Pradeep
    · Reply

    December 1, 2015 at 12:40 PM

    I am interested to know more about how to do genome assembly on a desktop computer and what programs are available from DNASTAR.

    • Katie Maxfield

      December 1, 2015 at 7:00 PM

      Dr. Pradeep, our Lasergene Genomics Suite has all the software you need to assemble and analyze whole genome sequencing data. You can request a free trial here.
