Several SeqMan NGen wizard screens allow you to specify DNASTAR genome template packages for common model organisms. Each template package contains template sequence, annotations, and database linking information. If you wish to use DNASTAR’s database association features (e.g., dbSNP, GERP, and COSMIC), you must input one of these genome packages in the appropriate screen for your workflow.
If you are using Cloud assemblies or have previously downloaded a genome package to your local computer, skip to step 5. Otherwise, start at step 1.
1. If a Download Genome Packages button is not present, follow the procedure in Manually download and extract a genome package. Otherwise, press the button.
2. Select a package from the list, and click Select.
3. When prompted, choose a location in which to save the package.
4. When the download finishes, click OK.
5. Click Add Genome Package. If you are running a local assembly, navigate to the location where you saved the automatically downloaded (or manually downloaded & extracted) package, and click Open. If you are running a Cloud assembly, select a package from the list, and click Select.
The genome template package now appears in the large white box on the left of the dialog.
- The Add Genome Package and Download Genome Package buttons are disabled if you have already added files using the Add or Add Folder buttons.
- SeqMan NGen can read and produce output using a variety of common chromosome naming conventions, including “chr1” and “ch1,” as well as Arabic and Roman numerals. Chromosome names are captured from genome template packages and used to assign contig IDs to entries from BED and Manifest files. (Note that manifest files are typically used to represent the coordinates of regions captured by procedures such as exon capture, performed prior to sequencing.)
- In the human genome template packages provided by DNASTAR, the “unlocated contig” is actually a concatenated, multi-sequence contig containing the alternate loci sequences. These loci are used for large regions where the human population contains variation so divergent that it cannot be adequately described by simple substitutions and small indels. Examples of these regions include the LRC/KIR complex on chromosome 19 and the MHC on chromosome 6.
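To illustrate the chromosome naming note above, the following Python sketch shows one way that names such as “chr1,” “ch1,” “1,” and “I” could be reduced to a common key and then used to assign template contig names to BED entries. It is not SeqMan NGen’s implementation. The function names `roman_to_int`, `normalize_chromosome`, and `assign_contigs` are hypothetical, and treating X, Y, M, and MT as chromosome names rather than Roman numerals is an assumption that would need to change for organisms (such as yeast) whose chromosome X is number 10.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50}
# Assumption: these are treated as chromosome names, not Roman numerals.
NAMED = {"X", "Y", "M", "MT"}


def roman_to_int(text):
    """Convert a Roman numeral such as 'XIV' to an int; return None if invalid."""
    if not text or any(ch not in ROMAN_VALUES for ch in text):
        return None
    total = 0
    for i, ch in enumerate(text):
        value = ROMAN_VALUES[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        if i + 1 < len(text) and value < ROMAN_VALUES[text[i + 1]]:
            total -= value
        else:
            total += value
    return total


def normalize_chromosome(name):
    """Reduce 'chr1', 'ch1', '1', and 'I' to the same key, '1'."""
    # Strip an optional 'chr' or 'ch' prefix, ignoring case.
    core = re.sub(r"^(chr|ch)", "", name.strip(), flags=re.IGNORECASE).upper()
    if core.isdigit():
        return str(int(core))
    if core in NAMED:
        return core
    value = roman_to_int(core)
    # Names that are neither numeric nor Roman are kept as-is (upper-cased).
    return str(value) if value is not None else core


def assign_contigs(template_names, bed_lines):
    """Pair each BED entry with the matching template contig name, or None."""
    lookup = {normalize_chromosome(n): n for n in template_names}
    result = []
    for line in bed_lines:
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        contig = lookup.get(normalize_chromosome(chrom))
        result.append((contig, int(start), int(end)))
    return result
```

For example, with template contigs named `chr1`, `chr4`, and `chrX`, BED entries written as `ch1`, `IV`, and `X` would be assigned to those contigs, while an entry on chromosome `7` would be left unassigned (`None`).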
After creating an assembly with a genome template, you can access dbSNP information as described in the following brief video: