The topic you requested could not be found.
Related topics are listed below.

Input Reference Sequences

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences

*Note: Three other screens similar to this one are described separately: Input Viral Genomes, Input Host Files, and Input Biome Genomes. Depending on the workflow, choosing a “reference” option in the Choose Assembly Type screen may cause the Input…

Input rRNA or Other Contaminant Sequences

Create and Run an Assembly » Wizard screen descriptions » Input rRNA or Other Contaminant Sequences

If you are following the transcript annotation workflow (de novo Transcriptome/RNA-Seq), you may enter the rRNA sequences here in order to screen out contaminants during assembly. *Note: It is not necessary to enter anything in this screen; if desired, you may…

Annotate reference sequences prior to import

Prepare Input Files » Annotate reference sequences prior to import

Using annotated reference sequences in SeqMan NGen may enable you to better analyze the identified putative SNPs when viewing your assembled project in SeqMan Pro. If desired, annotate your reference sequence in SeqBuilder Pro (the Lasergene application for sequence…

Prepare Input Files

Before beginning a SeqMan NGen project, you may wish to prepare certain types of input files in advance: Prepare paired-end data for Illumina or Sanger; Annotate reference sequences prior to import; Make a custom VCF or BED file or troubleshoot a Manifest…

Input Biome Genomes

Create and Run an Assembly » Wizard screen descriptions » Input Biome Genomes

If you are following the Metagenomics/Population Assembly workflow and selected either type of “reference based assembly” in the Choose Assembly Type screen, the Input Biome Genomes screen will appear. You must input the specified sequence(s) before…

Input Viral Genomes

Create and Run an Assembly » Wizard screen descriptions » Input Viral Genomes

If you are following the Viral-Host Integration workflow, the Input Viral Genomes screen will appear as one of the wizard screens. If so, you must input the specified sequence(s) before proceeding further in the wizard. See our Supported File Types page for…

Input Host Files

Create and Run an Assembly » Wizard screen descriptions » Input Host Files

If you are following the Viral-Host Integration workflow or if you are following the Metagenomics/Population Assembly workflow and selected either “host removal” option in the Choose Assembly Type screen, the Input Host Files screen will appear. You must…

Reference-guided workflow output

Appendix » Access and understand output files » Reference-guided workflow output

Reference-guided workflows vary in the number and contents of output files and folders. Only a subset of items in the table below may appear for a particular workflow. In the file names below, the project name should be understood to precede any hyphen (-) or…

Whole genome reference-guided workflow

Workflow Types » Whole genome reference-guided workflow

To follow the whole genome reference-guided workflow, also called the variant analysis workflow, make the following selections in the SeqMan NGen wizard: In the Choose Assembly Workflow screen, select Whole genome. In the Choose Assembly Type screen, choose…

RNA-Seq reference-guided workflow

Workflow Types » RNA-Seq reference-guided workflow

Did you arrive here by selecting the DNASTAR Navigator workflow Transcriptomics > RNA-seq? If so, you’re in the right place! The RNA-Seq reference-guided workflow is specified by choosing…

Assembly Report contents for reference-guided workflows

View the Assembly Report » Assembly Report contents for reference-guided workflows

The Assembly Report for reference-guided assemblies will contain a subset of the following results: Run Statistics Reference Seq Cnt The total number of sequences in the reference (template). Sequence Cnt The total number of reads…

Assembly Output (de novo, special reference-guided)

Create and Run an Assembly » Wizard screen descriptions » Assembly Output » Assembly Output (de novo, special reference-guided)

You must select a name and location for your project in the Assembly Output dialog before proceeding further in the wizard. The following version of the dialog is shown only when you are following the de novo or special reference-guided workflows. *Note: For other…

RNA-Seq reference-guided workflow output

Appendix » Access and understand output files » RNA-Seq reference-guided workflow output

If you are following the RNA-Seq reference-guided workflow (i.e., if you chose Transcriptome/RNA-Seq Assembly in the Choose Assembly Workflow screen, and Reference-guided assembly in the Choose Assembly Type screen), output results are saved in an .assembly package…

Input Assemblies and Define Individual Replicates

Create and Run an Assembly » Wizard screen descriptions » Input Assemblies and Define Individual Replicates

If you are following a Combine and/or Reanalyze Assemblies workflow on a local machine, the Input Assemblies and Define Individual Replicates screen is where you add, name and group the assemblies, and select a type for each assembly (e.g., RNA-Seq, CNV, etc.). The…

Use RNA-Seq reference-guided workflow options

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Use RNA-Seq reference-guided workflow options

In the Input Reference Sequences screen, the following options are available for the RNA-seq reference-guided workflow only. The Add .Transcriptome button is only available in this workflow, and is used to add .transcriptome packages. Before (and usually…

Add and remove files (reference, host, viral or biome genome)

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Add and remove files (reference, host, viral or biome genome)

The Input Reference Sequences, Input Viral Genomes, Input Host Files and Input Biome Genomes screens allow you to add sequence files, feature files and/or completed assemblies. To add files or folders of files: Local assemblies – Add files using the Add…

Assembly Options (special reference-guided, most de novo)

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Assembly Options (special reference-guided, most de novo)

The Assembly Options screen allows you to specify the parameters to use for your assembly. If you are following most de novo or special reference-guided workflows, you will see the following version of the dialog. Depending on your workflow, some of the options…

Cloud Tutorial 2: Whole genome reference-guided workflow (Salmonella)

SeqMan NGen Tutorials » Cloud Tutorial 2: Whole genome reference-guided workflow (Salmonella)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. Samples consist of four strains of Salmonella bacteria which are assembled against DNASTAR’s Salmonella genome template package. You don’t need to…

Use RNA-Seq de novo transcriptome output as a reference

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Use RNA-Seq de novo transcriptome output as a reference

The topic RNA-Seq de novo transcriptome workflow provides a list of steps for generating output in the form of contigs. You may use these contigs as reference sequences in the RNA-Seq reference-guided workflow; for example, to quantify the relative abundances of…

Whole genome reference-guided workflow with gap closure

Workflow Types » Whole genome reference-guided workflow with gap closure

Did you arrive here by selecting the DNASTAR Navigator workflow Genomics > Hybrid reference-guided/de novo genome assembly? If so, you’re in the right place! Reference-guided assembly with…

Input Sequence Files and Define Experiments or Individual Replicates

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates

Most workflows will include some version of the Input Sequence Files and Define Experiments or Individual Replicates screen. The upper part of the dialog consists of a drop-down menu for specifying the read technology of your sequence files. Other options in this…

Cloud Tutorial 1: Whole genome reference-guided workflow (Arabidopsis)

SeqMan NGen Tutorials » Cloud Tutorial 1: Whole genome reference-guided workflow (Arabidopsis)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. Samples consist of three mutant Arabidopsis thaliana genomes which are assembled against DNASTAR’s Arabidopsis genome template package. (Arabidopsis thaliana…

Cloud Tutorial 4: RNA-Seq reference guided workflow (E. coli)

SeqMan NGen Tutorials » Cloud Tutorial 4: RNA-Seq reference guided workflow (E. coli)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. The tutorial shows how to quantify differential gene expression by comparing two experimental E. coli bacteria samples and two wild-type samples to the E. coli K12…

Part A: Setting up an RNA-Seq reference-guided assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 3: RNA-Seq reference-guided workflow with analysis in ArrayStar » Part A: Setting up an RNA-Seq reference-guided assembly in SeqMan NGen

In this first part of Tutorial 3, you will learn how to set up the project in SeqMan NGen and run the assembly. To perform the assembly in Part A, you must download 4 GB of data and then wait about 1-3 hours for the assembly. A much shorter option is to simply read…

Default wizard settings for de novo and special reference-guided assemblies

Appendix » Running SeqMan NGen through the command line » Default wizard settings for de novo and special reference-guided assemblies

Save Project As SeqMan Format 10M read limit Editable Save Unassembled Reads False Save Contigs To FASTA False (Special Reference-Guided only) True (all others) Save Report True Parameters…

Part A: Setting up and running a reference-guided assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part A: Setting up and running a reference-guided assembly in SeqMan NGen

In this first part of Tutorial 1, you will set up and run a reference-guided assembly using the SeqMan NGen wizard. Click to download the data folder T1_T2_Whole_Genome.zip. Then extract the contents to any convenient location (e.g., your…

Part B: Finding a putative duplication in the reference sequence using ArrayStar

SeqMan NGen Tutorials » Tutorial 6: Copy number variation (CNV) workflow with analysis in ArrayStar and GenVision Pro » Part B: Finding a putative duplication in the reference sequence using ArrayStar

In Part A of this tutorial, you ran an assembly and launched the results in ArrayStar. In this part, you will use the ArrayStar Gene Table to locate potential duplications in the reference sequence. Imagine that you would like to find a region that is repeated in the…

Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro

In this tutorial, you will create a reference-guided assembly using SeqMan NGen and then analyze the results using SeqMan Pro. The time required for the assembly component is approximately 2-5 minutes. Start with Part A: Setting up and running a reference-guided…

Tutorial 3: RNA-Seq reference-guided workflow with analysis in ArrayStar

SeqMan NGen Tutorials » Tutorial 3: RNA-Seq reference-guided workflow with analysis in ArrayStar

RNA-Seq uses next-gen sequencing to show the presence and quantity of RNA in a genome at a particular moment. DNASTAR’s SeqMan NGen application is the starting point for both reference-guided and de novo RNA-Seq workflows. Because this tutorial involves a…

Manually specify an isoform

Prepare Input Files » Manually specify an isoform

By default, SeqMan NGen chooses the longest CDS as the isoform for SNP calling. If desired, you may override the automated choice by specifying the preferred isoform manually in the reference sequence. To do so, follow these steps prior to importing the reference…
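
The default rule described above (pick the longest CDS) can be illustrated with a minimal Python sketch. This is not SeqMan NGen's implementation: the dictionary layout, the `longest_cds` name, the use of 1-based inclusive exon coordinates, and the tie-breaking rule (first in the list wins) are all assumptions made for illustration.

```python
def cds_length(cds):
    """Total coding length, summing 1-based inclusive (start, end) exon intervals."""
    return sum(end - start + 1 for start, end in cds["exons"])


def longest_cds(cds_features):
    """Return the CDS with the greatest total length.

    Assumption: on a tie, the feature listed first is kept, which is how
    Python's max() behaves.
    """
    return max(cds_features, key=cds_length)


# Hypothetical isoforms of a single gene
isoforms = [
    {"name": "isoform_a", "exons": [(100, 199), (300, 399)]},  # 200 bases
    {"name": "isoform_b", "exons": [(100, 199), (300, 449)]},  # 250 bases
    {"name": "isoform_c", "exons": [(100, 349)]},              # 250 bases
]
chosen = longest_cds(isoforms)
```

With this hypothetical data, `isoform_b` is selected. A manual override, as the topic describes, would amount to bypassing this selection for a specific gene.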

Wizard screen descriptions

Create and Run an Assembly » Wizard screen descriptions

Each of the SeqMan NGen wizard screens is described in a separate topic. Each workflow will feature only a subset of all the available wizard screens. Begin Project Cloud Assembly Choose Assembly Workflow Choose Assembly Type Input Reference Sequences Input…

Metagenomics/population assembly workflow

Workflow Types » Metagenomics/population assembly workflow

Did you arrive here by selecting the DNASTAR Navigator workflow Genomics > Metagenomic and heterogenous samples? If so, you’re in the right place! The Metagenomics/population assembly workflow…

Exome and gene panel workflow

Workflow Types » Exome and gene panel workflow

Did you arrive here by selecting the DNASTAR Navigator workflow Genomics > Resequencing and genotyping? If so, you’re in the right place! For exome, gene panel, cancer, or genome-wide…

assemble

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » assemble

The assemble command is required; it preprocesses and assembles the sequences that have been loaded. Preprocessing may include quality trimming, and scanning for vector, repetitive, and contaminant sequences. Parameter Description Allowed values…

removeSmallContigs

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » removeSmallContigs

The removeSmallContigs command disassembles any contigs without reference sequences that have fewer than the specified number of sequences. Parameter Description Allowed values minLength Specifies the minimum length of a contig to prevent…

Automatically download and add genomes from NCBI

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Automatically download and add genomes from NCBI

The Input Reference Sequences and Input Biome Genomes screens allow you to download and/or add genomes directly from the NCBI database in either GenBank or FASTA formats. This option is only available for local assemblies. In the Input Reference/Host Files or…

Viral-host integration workflow

Workflow Types » Viral-host integration workflow

Viral-Host Integration is a special type of assembly used to locate putative viral insertion sites. It can also be used to predict the location of other inserted sequences, such as transposable elements. To follow this workflow, select Viral-Host Integration…

Part A: Creating the ChIP-Seq assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 8: ChIP-Seq workflow with analysis in ArrayStar » Part A: Creating the ChIP-Seq assembly in SeqMan NGen

In this part of Tutorial 8, you will set up and run a reference-guided assembly using the SeqMan NGen wizard. Click to download the data folder T8_ChIP-Seq.zip. Then extract the contents to any convenient location (e.g., your computer’s desktop).…

RNA-Seq de novo transcriptome workflow

Workflow Types » RNA-Seq de novo transcriptome workflow

Did you arrive here by selecting the DNASTAR Navigator workflow Transcriptomics > De novo transcriptome assembly and annotation? If so, you’re in the right place! The RNA-Seq de novo…

loadTemplate

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » loadTemplate

The loadTemplate command loads a sequence file to be used as a reference for all other sequences to be assembled to. The sequence will be displayed as a “reference” sequence in SeqMan Pro for SNP analysis. Parameter Descriptions Allowed values…

Part A: Setting up the CNV project in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 6: Copy number variation (CNV) workflow with analysis in ArrayStar and GenVision Pro » Part A: Setting up the CNV project in SeqMan NGen

In this first part of Tutorial 6, you will use the SeqMan NGen wizard to import data and run the assembly. You will then press a button to open the results in ArrayStar. Click to download the data folder T6_CNV.zip. Then extract the contents to any…

Add and remove accessory files (.vcf, .bed, etc.)

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Add and remove accessory files (.vcf, .bed, etc.)

In the Input Reference Sequences screen, certain workflows offer some subset of the following options. Include alternative assembly templates – (only available if a genome package was selected) Check the box to include an alternate sequence representation…

Manually download and extract a genome template package

Prepare Input Files » Manually download and extract a genome template package

*Note: This topic applies only to local assemblies. It is never necessary to manually download a genome package when running a Cloud Assembly. The Input Reference Sequences, Input Viral Genomes, Input Host Files, and Input Biome Genomes screens allow you to specify…

Part A: Creating the validation assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 5: Validation control accuracy workflow with analysis in ArrayStar » Part A: Creating the validation assembly in SeqMan NGen

This first part of Tutorial 5 begins with a Validation Control Accuracy assembly in SeqMan NGen. Click to download the data folder T5_Validation_Control.zip. Then extract the contents to any convenient location (e.g., your computer’s…

Validation control accuracy workflow

Workflow Types » Validation control accuracy workflow

To perform a validation control accuracy test, create a SeqMan NGen assembly project with the following settings: Begin Project: Assemble on local computer or Assemble on the DNASTAR cloud. Choose Assembly Workflow: Exome and Gene Panel. Choose Assembly…

Equivalence between wizard settings and SNG scripting commands

Appendix » Running SeqMan NGen through the command line » Equivalence between wizard settings and SNG scripting commands

The first two columns of the table below show the applicable SeqMan NGen wizard screen and setting name. The third column shows the equivalent scripting parameter. Each parameter is described in detail in SNG commands. Wizard Screen Wizard Setting…

Read Options

Create and Run an Assembly » Wizard screen descriptions » Read Options

The Read Options dialog displays the parameters used for running pre-assembly scans and allows you to adjust their values. This dialog is only available for de novo and special reference-guided workflows. Maximum total reads – The default value for…

Assembly and Signal Processing

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Assembly and Signal Processing

The Assembly and Signal Processing dialog is a specialized version of the Assembly Options screen. It is used in ChIP-Seq workflows or when merging multiple assemblies together. Import Variant Annotation Database (human build 37 and 38 only) – Check this…

Part A: Setting up and running a de novo assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 2: Whole genome de novo workflow with analysis in SeqMan Pro » Part A: Setting up and running a de novo assembly in SeqMan NGen

In this part of Tutorial 2, you will set up and run a de novo assembly using the SeqMan NGen wizard. The data used in this tutorial are 100bp paired-end Illumina sequences from E. coli. Sequences were trimmed from 1000x coverage to 32x coverage, allowing…

Sanger validation workflow

Workflow Types » Sanger validation workflow

The Sanger Validation workflow allows you to co-assemble non-Sanger and Sanger data in SeqMan NGen, and then view the results in SeqMan Pro. This workflow will save data in .sqd format as long as there are fewer than 10M reads. There are two circumstances…

Part A: Setting up and running the assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 7: Sanger validation workflow with analysis in SeqMan Pro » Part A: Setting up and running the assembly in SeqMan NGen

In this first part of Tutorial 7, you will set up and run the Sanger validation assembly in SeqMan NGen. Click to download the data folder T7_Sanger_Validation.zip. Then extract the contents to any convenient location (e.g., your computer’s…

Add and remove sequence read files

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Add and remove sequence read files

!IMPORTANT: If you are following the Sanger Validation workflow, you will see a slightly different version of this dialog, with separate areas for adding Sanger and non-Sanger data. Refer to the topic Sanger validation workflow before adding your files. Specifying…

Advanced Assembly Options: Alignment tab

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Advanced Assembly Options » Advanced Assembly Options: Alignment tab

Clicking the Advanced (Assembly) Options button from certain versions of the Assembly Options dialog launches a multi-tabbed Advanced Assembly Options dialog. This help topic describes options available in the Alignment tab. Default parameters in this tab are…

Tutorial 4: RNA-Seq de novo transcriptome workflow with analysis in SeqMan Pro

SeqMan NGen Tutorials » Tutorial 4: RNA-Seq de novo transcriptome workflow with analysis in SeqMan Pro

In this tutorial, you will de novo assemble an abbreviated set of paired end RNA-Seq sequences from Saccharomyces cerevisiae (yeast) from Nookaew I et al., 2012. This workflow uses an abbreviated yeast data set with about 1 million reads per file. With other…

Prepare paired-end data

Prepare Input Files » Prepare paired-end data

*Note: The following information does not apply to any workflows where Reference-based assemblies – normal workflow was selected in the Choose Assembly Type screen. Paired end reads are typically in two files with the forward reads in one file and the reverse…

Cloud Tutorial 5: RNA-Seq de novo workflow (Brassica)

SeqMan NGen Tutorials » Cloud Tutorial 5: RNA-Seq de novo workflow (Brassica)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. The sample is a single file of Brassica napus transcriptome data. (Brassica napus is known as “rapeseed,” and is the third largest source of cooking…

Choose Assembly Type

Create and Run an Assembly » Wizard screen descriptions » Choose Assembly Type

The Choose Assembly Type dialog allows you to choose between several types of templated and/or de novo assemblies, along with an accuracy test for variant calls. The image below shows only one version of this screen. Depending on your selection in the Choose Assembly…

Cloud Tutorial 6: Metagenomics workflow (human microbiome)

SeqMan NGen Tutorials » Cloud Tutorial 6: Metagenomics workflow (human microbiome)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. In this tutorial, you will assemble two data files with human microbiome samples against DNASTAR’s Microbial genome database, which consists of over 3,500…

2. Evaluating variants in the Alignment and Strategy views

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part B: Analyzing assembly results in SeqMan Pro » 2. Evaluating variants in the Alignment and Strategy views

Now that you have completed Part B, step 1, this section will show how to review the putative non-synonymous variants. Click on the Contig Pos column header to sort variants in ascending order of their positions in the alignment. Look at the Impact column…

Set up experiments for multi-sample data

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Set up experiments for multi-sample data

If you are using multi-sample data, check the Multi-sample data box, located in the Input Sequence Files and Define Experiments or Individual Replicates screen. The Multi-sample data box is checked by default for Cloud assemblies; otherwise, it is…

setParam

Appendix » Running SeqMan NGen through the command line » SNG commands » Parameter settings commands » setParam

The setParam command allows you to adjust the stringency of one or more of the assembling parameters for the project. SeqMan NGen will use the default values for any parameter that is not specified within the script. Parameter Description Allowed values…

assembleTemplate

Appendix » Running SeqMan NGen through the command line » XNG commands » assembleTemplate

*Note: All parameters are assumed to be optional unless the description is prefaced by “required.” assembleTemplate is a required command that initiates the assembly of the loaded sequences using the specified template as a reference. Example: XNG script used…

Advanced Assembly Options (untabbed)

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Advanced Assembly Options » Advanced Assembly Options (untabbed)

In a de novo or special reference-guided workflow, you can open the Advanced Assembly Options dialog by pressing the Advanced Assembly Options button from the Assembly Options dialog. Default parameters vary according to the sequencing technology and project…

Part B: Confirming non-Sanger reads in SeqMan Pro

SeqMan NGen Tutorials » Tutorial 7: Sanger validation workflow with analysis in SeqMan Pro » Part B: Confirming non-Sanger reads in SeqMan Pro

In Part A, you used SeqMan NGen to perform the Sanger validation assembly. In this part, you will use SeqMan Pro to visualize the results and to validate the Illumina data using the Sanger trace data. If you came to this topic directly from Part A, the file Sanger…

writeUnassembledSeqs

Appendix » Running SeqMan NGen through the command line » SNG commands » Project management commands » writeUnassembledSeqs

The writeUnassembledSeqs command saves all sequences that were not assembled in the project as .fas and .qual files. Parameter Description Allowed values file (required) Specifies the directory and file name of the unassembled sequences…

View the Assembly Report

The Assembly Report summarizes the assembly statistics, including the parameters used, the number of assembled/unassembled sequences and contigs in your project, and the average quality scores. To view the report from within the SeqMan NGen wizard: Cloud users:…

saveProject

Appendix » Running SeqMan NGen through the command line » SNG commands » Project management commands » saveProject

The saveProject command saves the assembly to a project file. By default, the SeqMan Pro project file format (.sqd) is used. Phrap (.ace) and FASTA (.fas) formats may also be specified by using the format parameter, and specifying the desired file extension using the…

loadSeq

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » loadSeq

The loadSeq command loads a sequence file or files for assembly. See our website for a list of supported file types. Parameter Description Allowed values blockContig Used in the reference-guided workflow. [text string]…

Part A: Setting up and running an RNA-Seq de novo transcriptome assembly in SeqMan NGen

SeqMan NGen Tutorials » Tutorial 4: RNA-Seq de novo transcriptome workflow with analysis in SeqMan Pro » Part A: Setting up and running an RNA-Seq de novo transcriptome assembly in SeqMan NGen

In this part of Tutorial 4, you will use SeqMan NGen to de novo assemble and annotate the RNA-Seq data. Click to download the data folder T4_De_Novo_RNA-Seq.zip. Then extract the contents to any convenient location (e.g., your computer’s…

Assembly Report contents for de novo workflows

View the Assembly Report » Assembly Report contents for de novo workflows

The Assembly Report for de novo assemblies will contain a subset of the following results: Assembly Totals Contigs Total number of contigs assembled. Contigs > 2K Total number of assembled contigs that are more than 2000 base pairs…

RNA-Seq de novo transcriptome workflow output

Appendix » Access and understand output files » RNA-Seq de novo transcriptome workflow output

If you are following the RNA-Seq de novo transcriptome workflow (i.e., if you chose Transcriptome/RNA-Seq Assembly in the Choose Assembly Workflow screen, and de novo assembly in the Choose Assembly Type screen), output results are saved in a folder called…

Assembly Options (all others)

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Assembly Options (all others)

The Assembly Options screen allows you to specify the parameters to use for your assembly. If you are following the transcript annotation workflow, or any workflow other than de novo, special reference-guided, reference-guided miRNA or ChIP-Seq or Combined…

Create an assembly to use in the “SNP to Structure” workflow

Workflow Types » Whole genome reference-guided workflow » Create an assembly to use in the “SNP to Structure” workflow

If you are working with reference-guided human assemblies, Lasergene’s “SNP to Structure” workflow lets you combine genomic sequencing and variant level data with structure files from the RCSB Protein Data Bank (PDB) to model point mutations on the protein…

Specify stranded RNA-Seq reads

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Specify stranded RNA-Seq reads

If you are following the RNA-Seq reference-guided workflow, an additional checkbox appears in the Input Sequence Files and Define Experiments or Individual Replicates screen: Stranded RNA-seq data. The box is unchecked by default. As background, some library…

loadRepeat

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » loadRepeat

The loadRepeat command loads a sequence file to be used to identify repeat sequences in the assembly. All sequences identified as repeats will be added to the assembly last, after all non-repeats have been assembled. See our website for a list of supported file…

ChIP-Seq workflow

Workflow Types » ChIP-Seq workflow

Did you arrive here by selecting the DNASTAR Navigator workflow Transcriptomics > ChiP-seq? If so, you’re in the right place! To follow the ChIP-Seq workflow, make the following selections in…

load454PairedEnd

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » load454PairedEnd

The load454PairedEnd command loads a file of Roche 454 sequences and checks for the presence of a linker defining the paired end sequences. If the linker is found, the linker is removed and the remaining portion is split into two sequences linked with a paired end…
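
The linker-splitting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behavior of load454PairedEnd itself: the `split_at_linker` name and the linker sequence are hypothetical, and the sketch matches the linker exactly at its first occurrence, without handling sequencing errors or reverse complements.

```python
def split_at_linker(read, linker):
    """Split a read at the first exact occurrence of the linker.

    Returns a (left, right) pair with the linker removed, or None when the
    linker is absent and the read cannot be treated as a pair.
    """
    pos = read.find(linker)
    if pos == -1:
        return None
    return read[:pos], read[pos + len(linker):]


# Hypothetical linker and read, for illustration only
LINKER = "TTGGAACC"
pair = split_at_linker("ACGTACGT" + LINKER + "GGCCTTAA", LINKER)
unpaired = split_at_linker("ACGTACGTGGCCTTAA", LINKER)
```

Here `pair` holds the two halves that would be treated as mates, and `unpaired` is `None` because no linker was found.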

Automatically download and add genome template packages

Create and Run an Assembly » Wizard screen descriptions » Input Reference Sequences » Automatically download and add genome template packages

Several SeqMan NGen wizard screens allow you to specify DNASTAR genome template packages for common model organisms. Each template package contains template sequence, annotations, and database linking information. If you wish to use DNASTAR’s database…

Contents of the .assembly package

Appendix » Access and understand output files » Reference-guided workflow output » Contents of the .assembly package

The .assembly package is part of the output for XNG workflows. (The contents of the -noSplit.assembly package are similar to those of the .assembly package.) In the file names below, the project name should be understood to precede any hyphen (-) or period (.)…

appendToAssembly

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » appendToAssembly

The appendToAssembly command is for the reference-guided workflow and is intended for internal use only.

loadContaminant

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » loadContaminant

The loadContaminant command loads a contaminant sequence file to be used to identify known contaminants, such as primers, in the assembly. Sequences that contain at least 12 matching 17-mers are flagged as contaminant sequences and will be removed from the assembly.…
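The "at least 12 matching 17-mers" rule above can be sketched in Python as follows. This is a simplified illustration, not the program's actual algorithm: it assumes exact, single-strand k-mer matching and counts matching positions in the read, and the function names are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_matching_kmers(read: str, contaminant_kmers: set, k: int = 17) -> int:
    """Count read positions whose k-mer occurs in the contaminant."""
    return sum(
        1
        for i in range(len(read) - k + 1)
        if read[i:i + k] in contaminant_kmers
    )


def is_contaminant(read: str, contaminant: str,
                   k: int = 17, min_matches: int = 12) -> bool:
    """Flag a read that shares at least min_matches k-mers with the contaminant.

    The defaults mirror the values quoted in the text (17-mers, 12 matches).
    """
    return count_matching_kmers(read, kmers(contaminant, k), k) >= min_matches


# Made-up contaminant sequence for illustration only.
contaminant = "ACGT" * 10
flagged = is_contaminant(contaminant[:28], contaminant)      # 12 matching 17-mers
not_flagged = is_contaminant(contaminant[:27], contaminant)  # only 11
```

A 28-base exact match is the shortest stretch that yields 12 overlapping 17-mers, which is why the two example reads fall on either side of the threshold.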

setPairSpecifier

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » setPairSpecifier

The setPairSpecifier command defines the paired end pair specifier for the paired Sanger and Illumina sequences in the assembly. This command must appear in the script before the assemble command, but after sequences have been loaded using the loadSeq command. For more…

computeSNP

Appendix » Running SeqMan NGen through the command line » XNG commands » computeSNP

Sets parameters for the SNP computation phase of the assembly. The command is designed for use with existing BAM files that have not been analyzed for SNPs, or to re-analyze an existing file with different parameters. Most of the parameters for computeSNP are…

Cloud vs. Local Assembly

Cloud vs. Local Assembly

Before starting an assembly, you will need to decide whether to run the assembly locally or as a DNASTAR Cloud Assembly. In local assembly, you set up and run the SeqMan NGen assembly on your desktop or laptop computer. You must wait for one assembly to finish…

Welcome to SeqMan NGen

Welcome to SeqMan NGen

Lasergene Genomics provides everything you need for assembly and analysis of genomic, metagenomic, exomes/gene panels and transcriptomic sequencing data, and supports all popular file formats. Most workflows will start with sequence assembly in SeqMan NGen. SeqMan…

Specify read technology

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Specify read technology

Before leaving the Input Sequence Files and Define Experiments or Individual Replicates dialog and proceeding to the next screen, you must make a selection from the Read technology drop-down menu: Illumina, 454, Ion Torrent, Pac Bio, Sanger or Other. The default…

Assembly Output (all others)

Create and Run an Assembly » Wizard screen descriptions » Assembly Output » Assembly Output (all others)

You must select a name and location for your project in the Assembly Output dialog before proceeding further in the wizard. The dialog below is shown for all workflows other than the de novo or special reference-guided workflows. However, the “Additional files”…

saveReport

Appendix » Running SeqMan NGen through the command line » SNG commands » Project management commands » saveReport

The saveReport command exports a report as a text file that summarizes assembly statistics, including the parameters used, the number of assembled/unassembled sequences and contigs, average quality scores, and the number of sequences excluded from the assembly due to…

Calculation of “match percentage”

Appendix » SeqMan NGen calculations » Calculation of “match percentage”

By default, SeqMan NGen uses a local match percentage which requires that the match percentage threshold be met in each overlapping window of 50 bases. The size of this window can be adjusted by specifying a different value for the match window parameter. An example…
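To illustrate the difference between a local and a global match percentage, here is a minimal Python sketch. It assumes two pre-aligned, ungapped strings of equal length, a window that slides one base at a time, and a hypothetical threshold of 80%; none of these details are confirmed by the text beyond the 50-base window.

```python
def passes_local_match(a: str, b: str,
                       threshold: float = 80.0, window: int = 50) -> bool:
    """Return True if every overlapping window meets the match threshold.

    a and b are assumed to be aligned, equal-length strings. If the
    alignment is shorter than the window, the whole alignment is
    evaluated as a single window (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    n = len(a)
    if n == 0:
        return False
    w = min(window, n)
    matches = [1 if x == y else 0 for x, y in zip(a, b)]

    # Score the first window, then update the count incrementally.
    current = sum(matches[:w])
    if 100.0 * current / w < threshold:
        return False
    for start in range(1, n - w + 1):
        current += matches[start + w - 1] - matches[start - 1]
        if 100.0 * current / w < threshold:
            return False
    return True


reference = "A" * 100
# 15 clustered mismatches: 85% identity overall, but only 70% in one window.
clustered = "A" * 40 + "C" * 15 + "A" * 45
local_ok = passes_local_match(reference, clustered)
```

The clustered example would pass a global 80% test yet fails the windowed one, which is the behavior a local match percentage is meant to capture.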

JASPAR (PWM)

Create and Run an Assembly » Wizard screen descriptions » Define Binding Proteins » JASPAR (PWM)

In the Define Binding Proteins screen, selecting JASPAR (PWM) from the Binding site type menu lets you use the JASPAR position weight matrix to locate binding sites for eukaryotic organisms. (For prokaryotic organisms, instead use the Transcription Factor…

splitTemplates

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » splitTemplates

The splitTemplates command splits reference contigs into multiple contigs in areas where there is zero coverage. Split contigs will be grouped into scaffolds with a defined position to allow for easy sorting when the project is viewed in SeqMan Pro. Annotations on the…
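The idea of splitting a reference wherever coverage drops to zero can be sketched in Python. This is an illustration under stated assumptions, not the command's implementation: it takes a hypothetical per-base coverage list and returns half-open intervals, and it does not model scaffolds or annotations.

```python
from typing import List, Tuple


def covered_segments(coverage: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals where coverage is non-zero.

    Each interval corresponds to one contig that would remain after
    splitting the reference at zero-coverage regions.
    """
    segments = []
    start = None
    for i, depth in enumerate(coverage):
        if depth > 0 and start is None:
            start = i                      # a covered region begins
        elif depth == 0 and start is not None:
            segments.append((start, i))    # a covered region ends
            start = None
    if start is not None:
        segments.append((start, len(coverage)))
    return segments


# Made-up coverage profile with two covered regions.
segments = covered_segments([0, 2, 3, 0, 0, 1, 1, 0])
```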

Workflow Types

Workflow Types

Don’t know where to start? The following topics describe some of the popular workflows used in SeqMan NGen. Many of these topics include brief (2-3 minute) videos showing the workflow in action. DNA-Seq and genotyping: Whole genome reference-guided…

Assembly Output

Create and Run an Assembly » Wizard screen descriptions » Assembly Output

The Assembly Output dialog varies depending on the workflow you are following. See the following topics for descriptions of the two main versions: Assembly Options for de novo or special reference-guided workflows Assembly Options for all other…

Tutorial 6: Copy number variation (CNV) workflow with analysis in ArrayStar and GenVision Pro

SeqMan NGen Tutorials » Tutorial 6: Copy number variation (CNV) workflow with analysis in ArrayStar and GenVision Pro

Copy-number variation (CNV) refers to genomic regions that have been repeated one or more times. These variations play an important role in normal genetic variation and in some diseases. DNASTAR’s CNV workflow is used to analyze genomic variation by…

splitLinkerReads

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » splitLinkerReads

The splitLinkerReads command splits specified reads based on their match to given linker sequences. Reads that align to the linker and include the linker site (as specified by the linkerSite parameter or by the cloneSite option in an .fof file) will be split into two…

setContaminantParam

Appendix » Running SeqMan NGen through the command line » SNG commands » Parameter settings commands » setContaminantParam

The setContaminantParam command allows you to adjust the parameters used for scanning for contaminant sequences. In order to be applied, this command must appear in the script before the loadContaminant command, and the contamScan parameter for the assemble command…

Non-English keyboards

Appendix » Non-English keyboards

SeqMan NGen recognizes only standard English-keyboard characters as input. If you are using a non-English keyboard, we recommend that you switch to a “virtual” English keyboard. Click a link for instructions: Windows 7 & 8, Macintosh 10.11.…

Files and Folders dialogs

Create and Run an Assembly » Wizard screen descriptions » Read Options » Files and Folders dialogs

The Read Options screen allows you to access Vector, Contaminant, and Repeat Files and Folders dialogs via the three associated Add buttons. These nearly identical Files and Folders dialogs are used to add files for the functions of vector trimming, contaminant…

Choose Assembly Workflow

Create and Run an Assembly » Wizard screen descriptions » Choose Assembly Workflow

The Choose Assembly Workflow dialog lets you select from a variety of workflow types divided into categories. Once you make a selection, SeqMan NGen will populate the rest of the wizard with appropriate default parameters for your project. For example, your…

Access and understand output files

Appendix » Access and understand output files

The output file structure for a SeqMan NGen assembly varies depending upon the workflow. For a description of output files, see Reference-guided workflow output or De novo workflow output. Note that FASTQ files created with the SeqMan NGen wizard (common) will have a…

Part B: Analyzing assembly results in SeqMan Pro

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part B: Analyzing assembly results in SeqMan Pro

In Part A of this tutorial, you set up and ran a reference-guided assembly in SeqMan NGen. In this part, you will use SeqMan Pro to visualize the finished SeqMan NGen assembly, and to analyze variants and structural variations. Start with 1. Viewing and optimizing…

Specify experimental controls

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Specify experimental controls

The Set Up Experiments screen appears if you did either of the following: Checked the Samples have replicates box in the Input Sequence Files and Define Experiments or Individual Replicates screen. Are following the Whole Genome or Exome and Gene Panel workflows…

Contents of the -reports folder

Appendix » Access and understand output files » Reference-guided workflow output » Contents of the .assembly package » Contents of the -reports folder

The -reports folder is part of the XNG .assembly package. In the table below (and in the sentence above), it should be understood that the project name precedes any hyphen (-) or period (.) used at the beginning of file and folder names. File Suffix or…

setRepeatParam

Appendix » Running SeqMan NGen through the command line » SNG commands » Parameter settings commands » setRepeatParam

The setRepeatParam command allows you to adjust the parameters used for scanning for repetitive sequences. In order to be applied, this command must appear in the script before the loadRepeat command, and the repeatScan parameter for the assemble command must be set to…

XNG, SNG, and QNG assemblers

Appendix » Running SeqMan NGen through the command line » XNG, SNG, and QNG assemblers

SeqMan NGen uses three powerful assemblers: XNG, SNG and QNG. The XNG assembler: The XNG assembler (patent pending) is used for all reference-guided assemblies. This assembler features an algorithm for fast, accurate assembly of extremely large genomes and is…

splitMIDSeqs

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » splitMIDSeqs

The splitMIDSeqs command is used to split 454 MID reads into individual files with one file per MID tag. Parameter Description Allowed values destination The location and filename for the output. [directory/filename enclosed in…

createGenomeTemplate

Appendix » Running SeqMan NGen through the command line » XNG commands » createGenomeTemplate

The command createGenomeTemplate is intended for internal use only. Parameter Description Allowed values (defaults in bold/underline) file Specifies the directory and file/folder of the input file. [directory/filename enclosed in…

Handling of repeats

Appendix » SeqMan NGen calculations » Handling of repeats

Repeat handling parameters compute a threshold for deciding the number of identical subsequences of bases (mers) used to indicate a putative repeat. Mers that are common to two or more fragment reads are aligned to determine the overall layout of reads. For additional…

Filter based on “P not Ref”

Workflow Types » Whole genome reference-guided workflow » Filter based on “P not Ref”

In workflows using a reference sequence (e.g., the Variant analysis workflow), “P not Ref” is the probability that the base does not match the reference. The P not Ref cutoff can be set using a variety of methods, including both “hard” and “soft”…

loadLayout

Appendix » Running SeqMan NGen through the command line » SNG commands » File loading commands » loadLayout

The loadLayout command loads a layout file to be used for an assembly. The format may be either a SOLiD General Feature Format file (.gff) or a File of Filenames file (.fof). When this command is used, SeqMan NGen still aligns each read from the file to the reference,…

SeqMan NGen Tutorials

SeqMan NGen Tutorials

The following tutorials cover several of the most popular workflows in SeqMan NGen. Each tutorial begins with setting up and running an assembly in SeqMan NGen, then proceeds to other Lasergene applications for downstream analysis. The upper part of this topic,…

3. Adding variants to a custom database

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part B: Analyzing assembly results in SeqMan Pro » 3. Adding variants to a custom database

Now that you have finished Part B, step 2, you will use SeqMan Pro to create a custom database from variants in the Variant Report. This custom database can be used for subsequent assemblies with the same reference sequence, and can be updated at any time to reflect…

removeDuplicateSeqs

Appendix » Running SeqMan NGen through the command line » XNG commands » removeDuplicateSeqs

The removeDuplicateSeqs command is used to coalesce multiple identical reads at the same position into a single read, provided the reads match the reference exactly. If this feature is active, at the end of assembly, XNG will print the message: “Coalesced $lld…
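The coalescing rule described above (identical reads at the same position, and only when they match the reference exactly) can be sketched in Python. The data layout, a list of `(position, sequence)` tuples, and the function name are assumptions for illustration; this is not how XNG stores or processes reads.

```python
from typing import List, Tuple


def coalesce_duplicates(reads: List[Tuple[int, str]],
                        reference: str) -> Tuple[List[Tuple[int, str]], int]:
    """Collapse identical reads at the same position into one read.

    Only reads that match the reference exactly are coalesced; reads
    with differences are always kept. Returns (kept_reads, n_coalesced).
    """
    seen = set()
    kept = []
    coalesced = 0
    for pos, seq in reads:
        exact = reference[pos:pos + len(seq)] == seq
        key = (pos, seq)
        if exact and key in seen:
            coalesced += 1          # duplicate of an exact-match read
            continue
        if exact:
            seen.add(key)
        kept.append((pos, seq))
    return kept, coalesced


# Made-up reference and reads for illustration only.
reference = "ACGTACGTAC"
reads = [(0, "ACGT"), (0, "ACGT"), (0, "ACGA"), (0, "ACGA"), (4, "ACGT")]
kept, n_coalesced = coalesce_duplicates(reads, reference)
```

In this sketch, the two mismatching `ACGA` reads are both kept, since they may carry variant evidence.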

realignContigs

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » realignContigs

The realignContigs command causes SeqMan NGen to perform another pass through a reference-guided assembly once the initial assembly is complete, and realigns contigs as needed. (This step occurs automatically for de novo assemblies.) Using this command may improve the…

Default wizard settings for all other assembly types

Appendix » Running SeqMan NGen through the command line » Default wizard settings for all other assembly types

Parameters: Illumina | 454 | Ion Torrent | Pac Bio | Sanger | Other
Set pair information, if paired: 500 | 3000 | user defined | no pairs allowed | 5000 | 5000
Assembly Options
Mer size – …

1. Checking conflicts, depth of coverage and pair consistency

SeqMan NGen Tutorials » Tutorial 2: Whole genome de novo workflow with analysis in SeqMan Pro » Part B: Analyzing results in SeqMan Pro » 1. Checking conflicts, depth of coverage and pair consistency

This first section of Part B will show you how to evaluate one of the contigs assembled by SeqMan NGen. If you came to this topic directly from Part A, the file De novo assembly.sqd will already be open in SeqMan Pro. Use File > Save and save the project under the…

Part C: Confirming the duplication using GenVision Pro

SeqMan NGen Tutorials » Tutorial 6: Copy number variation (CNV) workflow with analysis in ArrayStar and GenVision Pro » Part C: Confirming the duplication using GenVision Pro

In Part B of this tutorial, you located a putative duplication in the reference sequence using ArrayStar. Deletions or duplications can be confirmed graphically by sending them to SeqMan Pro or GenVision Pro. In this section, you will use GenVision Pro to view a…

Illumina pairs

Prepare Input Files » Prepare paired-end data » Illumina pairs

Paired end reads are typically in two files, or a small number of files if they are from multiple runs or lanes. These pairs are specified by a naming convention used in the .fasta file comment line. For de novo assemblies with paired end reads, SeqMan NGen…

Tutorial 5: Validation control accuracy workflow with analysis in ArrayStar

SeqMan NGen Tutorials » Tutorial 5: Validation control accuracy workflow with analysis in ArrayStar

Validation Control Accuracy testing statistically measures the agreement between the called variants in the control sample assembly (“validation control”) and a curated answer set for that reference material (“standard”). In this tutorial, you will use…

loadBAM

Appendix » Running SeqMan NGen through the command line » XNG commands » loadBAM

The command loadBAM is used to set parameters for analyzing existing BAM files. It allows ungapped BAM files to be converted into a fully gapped assembly file or to re-gap an existing file with different parameters. The command also permits SNPs to be calculated or…

Filter based on “P not Ref”

Appendix » Running SeqMan NGen through the command line » Filter based on “P not Ref”

In workflows using a reference sequence (e.g., the Variant analysis workflow), “P not Ref” is the probability that the base does not match the reference. The P not Ref cutoff can be set using a variety of methods, including both “hard” and “soft”…

convertReads

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » convertReads

The convertReads command converts a sequence from one file format to another. This command is particularly useful for converting SOLiD .csfasta files into .fastq files that can be used by the XNG assembler. Parameter Description Allowed values…

trimVector

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » trimVector

The trimVector command is used for fast trimming of vector sequence. Each read file is processed and the trimmed file is saved to the destination folder. If a file with the same name already exists, a number will be appended to the file name. The file is saved in .fastq format,…

Peak detection methods

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Assembly Options (all others) » Peak detection methods

The Assembly and Signal Processing screen allows you to specify a peak detection method. Available methods are described in the table below: Name Description MACS The MACS Peak Finder is based on the peak detection algorithm (Zhang, et al.,…

Research references

Appendix » Research references

Benjamini Y and Hochberg Y (1995). “Controlling the false discovery rate: a practical and powerful approach to multiple testing.” Journal of the Royal Statistical Society. Series B (Methodological), Vol. 57, No. 1 (1995), pp. 289-300. Read online Fitzgerald…

Annotation Options

Create and Run an Assembly » Wizard screen descriptions » Transcript Annotation Database » Annotation Options

To open the Annotation Options dialog, press the Transcript Annotation Options button in the Transcript Annotation Database screen. The Annotation Options dialog can be used to change the default naming convention used for RefSeq packages, or to specify a custom GREP…

splitPairs

Appendix » Running SeqMan NGen through the command line » SNG commands » Preprocessing and assembling commands » splitPairs

The splitPairs command is used to split 454 or Ion Torrent mate pair files into forward and reverse (and singleton) files. Parameter Description Allowed value destination The location and filename for the output. [directory/filename…

Define Binding Proteins

Create and Run an Assembly » Wizard screen descriptions » Define Binding Proteins

This wizard screen only appears for reference-guided ChIP-Seq projects. The Define Binding Protein dialog allows you to define binding sites for your experiment. Choose a Binding site type from the drop-down menu: Type-in pattern – To…

setVectorParam

Appendix » Running SeqMan NGen through the command line » SNG commands » Parameter settings commands » setVectorParam

The setVectorParam command allows you to adjust the parameters used for vector trimming. In order to be applied, this command must appear in the script before the loadVector or trimVector command, and the vectScan parameter for the assemble command must be set to…

Detection of structural variations

Appendix » SeqMan NGen calculations » Detection of structural variations

In addition to SNPs and small insertions and deletions, genetic variation can also involve large scale rearrangements. These rearrangements may include large insertions and deletions, inversions, and translocations — collectively known as structural variations…

Merge multiple assemblies together

Workflow Types » Merge multiple assemblies together

Did you arrive here by selecting the DNASTAR Navigator workflow Genomics (or Transcriptomics) > Combine assemblies into a single analysis project? If so, you’re in the right place! Once you…

Remove PhiX control reads from Illumina data

Prepare Input Files » Remove PhiX control reads from Illumina data

During de novo assembly, contamination of Illumina data with PhiX control sequence may result in the generation of spurious contigs. For background information, please see Mukherjee et al. Standards in Genomic Sciences 2015, 10:18. Note that: Not all Illumina data…

dumpSNP

Appendix » Running SeqMan NGen through the command line » XNG commands » dumpSNP

The command dumpSNP is intended for internal use only, and creates a tab-delimited text file from one or more SNP-containing binary files generated during assembly. SNP binary files include those with the .snpExt suffix contained in an .assembly package as well as…

4. Discovering structural variations

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part B: Analyzing assembly results in SeqMan Pro » 4. Discovering structural variations

As you saw in Part B, step 1, short indels were included in the Variants Report. However, any indels long enough to inhibit assembly are instead gathered in another report called the Structural Variation Report. Return to the Project Summary window (with…

Make a custom VCF file

Prepare Input Files » Make a custom VCF file

Variant Call Format (VCF) files have multiple uses. For instance, they can provide a way to flag previously known SNPs and to filter them in SNP tables. In DNASTAR’s SeqMan NGen, these SNPs are called "annotated SNPs"; in ArrayStar, they are referred to as "user…

Advanced Trim/Scan Options

Create and Run an Assembly » Wizard screen descriptions » Read Options » Advanced Trim/Scan Options

From the Read Options screen, clicking the Advanced Trim/Scan Options button brings you to the Advanced Trim/Scan Options dialog. This dialog allows you to view and modify trimming parameters and vector, repeat and contaminant scanning parameters. Quality end…

Advanced Assembly Options: Variant tab

Create and Run an Assembly » Wizard screen descriptions » Assembly Options » Advanced Assembly Options » Advanced Assembly Options: Variant tab

Clicking the Advanced (Assembly) Options button from certain versions of the Assembly Options dialog launches a multi-tabbed Advanced Assembly Options dialog. This help topic describes options available in the Variant tab. The Variant tab is used to view and edit…

1: Viewing and optimizing the Variant Report

SeqMan NGen Tutorials » Tutorial 1: Whole genome reference-guided workflow with analysis in SeqMan Pro » Part B: Analyzing assembly results in SeqMan Pro » 1: Viewing and optimizing the Variant Report

This section shows how to open and optimize SeqMan Pro’s Variant Report. If you came to this topic directly from Part A, the file Templated assembly.assembly will already be open. Otherwise, launch SeqMan Pro and use File > Open to open it. Because this…

Set up replicates and replicate sets

Create and Run an Assembly » Wizard screen descriptions » Input Sequence Files and Define Experiments or Individual Replicates » Set up replicates and replicate sets

Some workflows provide an option for setting up replicate sets. Replicates and replicate sets are specified in two sequential wizard screens. To specify replicates in the Input Sequence Files and Define Experiments or Individual Replicates screen: Check the Run…

Cloud Tutorial 3: Whole genome de novo workflow (Salmonella)

SeqMan NGen Tutorials » Cloud Tutorial 3: Whole genome de novo workflow (Salmonella)

This tutorial is written for those with a Cloud Assembly license, whether purchased or from a free trial. The sample file is the genome from a single strain of Salmonella bacteria. You don’t need to download any data to follow this tutorial, as it is provided in…