The Input Reference Sequences, Input Viral Genomes, Input Host Files, and Input Biome Genomes screens allow you to specify DNASTAR genome template packages for common model organisms. Each template package contains template sequence, annotations, and database linking information. If you wish to use DNASTAR’s database association features (e.g., dbSNP, GERP, and COSMIC), you must input one of these genome packages in the appropriate screen for your workflow.

If you are performing a local assembly, you can usually download and add genome template packages automatically. However, in some circumstances you must download and extract the package manually before using the wizard.

To download a genome package:

Go to DNASTAR’s Genome Template Packages web page and download a free template package with the genome of interest. Each package contains the template sequence, annotations, and associated dbSNP linking information.

Downloaded genome packages are saved on your computer as ZIP files and must be extracted before use.

To extract a downloaded genome package:

  • On Macintosh: Double-click on the ZIP file. The files will be automatically extracted via the Archive Utility.
  • On Windows 7 & Windows 8: Use any archive utility to extract the files. For example, double-click the ZIP file and, in the Explorer window that opens, click Extract all files at the top left. Choose a location for the files and select Extract.
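
If you prefer to script the extraction step on either platform, the following is a minimal sketch that uses Python's standard zipfile module. The function name extract_genome_package and the file paths in the usage note are hypothetical and are not part of SeqMan NGen; the sketch only unpacks the archive and does not add the package to the application.

```python
import zipfile
from pathlib import Path


def extract_genome_package(zip_path, dest_dir):
    """Extract a downloaded ZIP archive into dest_dir.

    Hypothetical helper for illustration only. Returns the paths of
    the archive members as they appear under dest_dir.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)

    # Fail early if the file is missing or is not a ZIP archive.
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid ZIP archive")

    # Create the destination folder if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()

    return [dest_dir / name for name in names]
```

For example, extract_genome_package("Downloads/genome_package.zip", "Genomes/genome_package") would unpack a hypothetical download into a Genomes folder. Both paths are placeholders; substitute the actual location of your downloaded package.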

See Automatically download and add genome template packages for instructions on adding the genome package to SeqMan NGen.

Notes about manually downloading & adding genome packages:

  • SeqMan NGen can read and produce output using a variety of common chromosome naming conventions, including “chr1” and “ch1,” as well as Arabic and Roman numerals. Chromosome names are captured from genome template packages and used to assign contig IDs to entries from BED, VCF, and manifest files.
  • In the human genome template packages provided by DNASTAR, the “unlocated contig” is actually a concatenated, multi-sequence contig containing the alternate loci sequences. These loci are used for large regions where the human population contains variation so divergent that it cannot be adequately described by simple substitutions and small indels. Examples of these regions include the LRC/KIR complex on chromosome 19 and the MHC on chromosome 6.
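
To illustrate the chromosome naming conventions mentioned in the first note, here is a minimal sketch of how names such as “chr1,” “ch1,” “1,” and a Roman numeral could be reduced to one common label when you prepare your own BED or VCF files. It is not SeqMan NGen's implementation. The function names and the roman flag are assumptions made for this example; the flag exists because a label like “X” is ambiguous (a sex chromosome in human, the number 10 in genomes that use Roman numerals).

```python
import re

# Roman numeral symbols needed for typical chromosome counts.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'XVI' to an integer."""
    total = 0
    prev = 0
    # Read right to left; a smaller value before a larger one is subtracted.
    for ch in reversed(numeral):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def normalize_chromosome(name, roman=False):
    """Return a common label for a chromosome name (hypothetical helper).

    Strips a leading 'chr' or 'ch' prefix, removes leading zeros from
    Arabic numerals, and converts Roman numerals only when roman=True.
    """
    core = re.sub(r"^(chr|ch)", "", name.strip(), flags=re.IGNORECASE)
    if core.isdigit():
        return str(int(core))
    upper = core.upper()
    if roman and upper and all(c in ROMAN_VALUES for c in upper):
        return str(roman_to_int(upper))
    # Non-numeric labels such as X, Y, or MT are returned in upper case.
    return upper
```

With this sketch, normalize_chromosome("chr1"), normalize_chromosome("ch1"), and normalize_chromosome("1") all return "1", normalize_chromosome("chrX") returns "X", and normalize_chromosome("chrXVI", roman=True) returns "16".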
