Note: All commands and parameters are assumed to be optional unless the description is prefaced by “required.”
Command |
Parameter |
Description |
Allowed values (defaults in bold/underline) |
Wizard equivalent | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
assembleTemplate
(required) Initiates the assembly of the loaded sequences using the specified template as a reference.
Example:
XNG script used in the “clustering” step of the transcript annotation workflow:
merSize: 25 minNewClusterSize: 5 minSingleMergeClusterSize: 7 minMultiMergeClusterSize: 7 minMultiMergeIgnoreFactor: currently not used by default minClusterSizeToOutput: 100 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
alignmentCutoff |
Used in the “clustering” step of the transcript annotation workflow. |
[number]
Default = 200 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
assemble |
Specifies whether to use the part of the query that matches the contaminant sequence(s), the part that doesn’t match, or both. |
[matchContam|noMatchContam|all] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
assemblyInfo |
Contains information about the assembly. |
[text string] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
assemblyInfoAlt |
Contains pairs of keys and values which will be written to the -0.assemblyInfo file. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
autoTrim |
Specifies whether mismatching ends of reads should automatically be trimmed. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
boneyardAssembly |
Specifies whether sequences not used in the original or incremental XNG assemblies should be added to the assembly project by the SNG assembler. This command pertains only to reference-guided assemblies with gap closure. By default, during this type of assembly, the XNG assembler first finds structural variations (SVs) then splits the contig after each SV. Elements of this process can be modified using this command. (Note: “Boneyard” is a term for sequences that were not assigned to any contig). |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
combineDuplicateSeqs |
Specifies whether the duplicate reads will be clustered. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
contaminant |
Use of this parameter partitions the query data by running an additional mer-match (layout) against the specified contaminant sequence(s). A full assembly is then run using the part of the query that either matches or does not match the contaminant sequence(s). This parameter can be used for removing reads originating from an organism(s) that may have also been present in the query data set (e.g., reads from human DNA present in a metagenomic sample from the human gut).
file: [directory/filename enclosed in quotes] the file with contaminant sequences.
assembleContam: [matchContam|noMatchContam|all]
merLayoutMin: [number]
unassembled: [directory/filename enclosed in quotes] the file containing no contaminant reads. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
dbSNPTable |
(Intended for internal use only). |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
delayAlignInserts |
Turns the delaying of reads that cause inserts on or off. ‘True’ means that gap-causing reads will be delayed: reads causing the fewest inserts (insert length is not considered) are added before those causing more inserts. |
[true|false]
Defaults: true for named read technologies; false for ‘Other’ read technologies. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
deleteIntermediates |
Specifies whether intermediate files are saved or deleted. These files can be large with large-scale projects. |
[true|false|none|all|notTemplateMer] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directoryMer |
Specifies the path and directory where both the template and query data mer files will be stored. Alternatively, separate directories for the template and query mer files can be specified using the parameters below. If no directory is specified, the mer file will be created in the directory containing the sequence data. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directoryQueryMer |
(required) Specifies the path and directory where the query mer file will be stored. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directoryTemplateMer |
(required) Specifies the path and directory where the template mer file will be stored. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
filterDeepLayout |
(optional) Specifies that XNG remove superfluous sequences in areas of deep coverage. Set to ‘false,’ by default, except for projects involving miRNA or microbial genomes, where it is set to ‘true.’ |
[true|false] |
‘true’ = the Limit all deep coverage regions radio button is selected in the Advanced Assembly Options > Alignment tab dialog | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
filterDeepLayoutOrganelle |
(optional) Specifies that XNG remove superfluous sequences in areas of deep coverage. Set to ‘false’ by default, except for projects involving a mitochondrial or chloroplast template (i.e., those with a short name of ‘MT’, ‘M’, ‘CHL’, or ‘chloro’), where it is set to ‘true.’ |
[true|false] |
‘true’ = the Only limit deep coverage regions for Mitochondria and Chloroplasts radio button is selected in the Advanced Assembly Options > Alignment tab dialog | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
forceFullForwardAlign |
Start the alignment at the 5’ end of the sequence. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
forceMake |
Specifies whether new intermediate mer files will be created. A value of false means that existing valid intermediate files will be used. |
[true|false|query|hit|layout] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
format |
Specifies the format of the alignment output file. If ‘none’ is entered, the assembly is run to include the alignment phase, but no alignment output is generated. This parameter can be used to remove reads from a contaminant source. |
[BAM|SQD|NONE|NONE_align|Aux_align] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
gap5Prime |
Put the gap on the 5’ side of the sequence. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
gapPenalty |
The penalty for opening or extending a gap during an alignment. This penalty is deducted from the pairwise score used to calculate match percentage. A high gap penalty suppresses gapping, while a low value promotes gapping. |
[number]
Default = 30 for most workflows, 50 for the transcript annotation workflow |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
gapExtensionPenalty |
Used in the “clustering” step of the transcript annotation workflow. |
[number]
Default = 5 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
geneticCode |
This parameter specifies the genetic code to use with a reference sequence. |
[filepath/standard Lasergene genetic code file name] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
hits |
(required) Specifies the path and name of the hit file. Incomplete paths will be appended to the default directory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
increaseRunGapPen |
This parameter is a flag to increase the gap open penalty in HP runs. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
layout |
(required) Specifies the path and name of the layout file. Incomplete paths will be appended to the default directory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
layoutAlign |
Specifies that a pairwise alignment should be performed at the layout phase in order to pick the best position for a given read. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
layoutMaxTemplateGap |
The maximal number of gaps introduced into the alignment used during layout. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
layoutRSRange |
The maximal Register Shift difference used while building the layout. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
layoutType |
Specifies how reads are to be laid out. |
[unique|once|multiple|multipleAll] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
matchScore |
The score for a base match during an alignment. This score contributes to the pairwise score used to calculate match percentage. Increasing the matchScore value allows for longer or more frequent gaps, thus forcing bases that match to be assembled together. |
[number]
Default = 10 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
MaxGap |
The maximum number of gaps allowed per 1000 bases in the alignment. |
[number from 0-1000]
Default = 6 for most workflows, 30 for the transcript annotation workflow |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
maxMergeSize |
When linking clusters into a scaffold, only link them together if the overall number of reads in the scaffold would not exceed this threshold. Used in the “clustering” step of the transcript annotation workflow. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
maxNCnt |
(optional) Removes consecutive runs of the IUPAC ambiguity code ‘N’ whose length is greater than or equal to the number specified. Use of this parameter may help in assemblies whose reads contain large clusters of spurious N’s. |
[integer] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
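As a rough illustration of the maxNCnt threshold, the Python sketch below locates runs of ‘N’ that are at least as long as the specified number. The function name is hypothetical and not part of XNG, and the sketch only finds the runs; how XNG actually treats the affected reads is not described here.

```python
import re


def find_n_runs(read: str, max_n_cnt: int) -> list[tuple[int, int]]:
    """Return (start, end) spans of runs of 'N' whose length is >= max_n_cnt."""
    if max_n_cnt < 1:
        raise ValueError("max_n_cnt must be at least 1")
    # N{k,} matches k or more consecutive N characters.
    pattern = re.compile(r"N{%d,}" % max_n_cnt)
    return [m.span() for m in pattern.finditer(read.upper())]


# The run of four N's qualifies at a threshold of 3; the run of two does not.
print(find_n_runs("ACGTNNNNACNNGT", 3))
```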
|
maxSecondaryTrimLength |
During alignment, a read can be trimmed from both ends. This parameter defines the longest allowable length for the smaller of the two trimmed ends. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
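The maxSecondaryTrimLength rule compares the limit against the smaller of the two trimmed ends. A minimal Python sketch of that comparison follows; the function name is hypothetical, and what XNG does with reads that fail the check is not stated in this table.

```python
def secondary_trim_ok(trim_5p: int, trim_3p: int, max_secondary_trim_length: int) -> bool:
    """True if the smaller of the two trimmed ends is within the allowed length."""
    # Only the shorter trimmed end is compared against the limit.
    return min(trim_5p, trim_3p) <= max_secondary_trim_length


print(secondary_trim_ok(trim_5p=40, trim_3p=3, max_secondary_trim_length=5))
print(secondary_trim_ok(trim_5p=40, trim_3p=12, max_secondary_trim_length=5))
```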
|
maxSeqs |
Specifies the maximum number of query sequences to add to an assembly. Use of this command can speed up assembly. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
merCntThresh |
Minimum number of mers needed in order to be recorded in the mer file. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
merLayoutMin |
Specifies the minimum length (in bases) of at least one stretch of matching mers used to identify matches between the reference and query data. The minimum value is equal to the mer size. The maximum value is the read length, which would require the entire read to be an exact match. For example, with a merSize of 19 and a merLayoutMin of 21, at least one stretch of three consecutive mers in a read would have to match for the read to be included in the layout. |
[number from 11-1000]
Default = 25 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
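The merSize/merLayoutMin example in the merLayoutMin row can be checked with simple arithmetic: n consecutive mers of length k, each offset by one base, span k + n - 1 bases. The Python sketch below assumes that one-base offset; the function name is hypothetical and not part of XNG.

```python
def consecutive_mers_required(mer_size: int, mer_layout_min: int) -> int:
    """Number of consecutive overlapping mers needed to span mer_layout_min bases."""
    if mer_layout_min < mer_size:
        raise ValueError("merLayoutMin cannot be smaller than merSize")
    # n overlapping mers of length k (shifted by one base each) cover k + n - 1 bases.
    return mer_layout_min - mer_size + 1


# merSize 19 with merLayoutMin 21 requires three consecutive matching mers.
print(consecutive_mers_required(19, 21))
```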
|
merMinimizer |
(Intended for internal use only) |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
merSize, merLength or matchSize |
(required) Specifies the length (in bases) of mers used to identify matches between the reference and query data. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
merSkip |
(Intended for internal use only) Specifies the number of positions to ignore or “skip” when creating the template mer file. Normally, mers are only skipped in the query (see merSkipQuery, below). The first and last mer of every read are always included. Increasing the value reduces the size of the intermediate files as well as the overall assembly time. However, larger values can also reduce the number of reads included in the assembly, especially with short read data.
0 = do not skip
2 = skip every second base
3 = skip every third base, etc. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
merSkipQuery |
Specifies the number of positions to ignore or “skip” when creating the query mer file. The first and last mer of every read are always included. Increasing the value reduces the size of the intermediate files as well as the overall assembly time. However, larger values can also reduce the number of reads included in the assembly, especially with short read data.
0 = do not skip
2 = skip every second base
3 = skip every third base, etc. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
method |
Defines how to handle splits in the assembly:
normal – normal assembly method
splitOnly – only reads which have been split will be included in the assembly
noSplit – no reads will be split |
[normal|splitOnly|noSplit] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minAlignedLength |
Specifies the minimum number of bases that must align after trimming for a read to be included in the assembly. |
[number from 11-1000]
Default = 25 for most workflows, 50 for the transcript annotation workflow |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minClusterSizeToOutput |
Threshold for the number of reads that a cluster must contain in order for the cluster to be passed along to SNG for assembly in the next step of the program. Used in the “clustering” step of the transcript annotation workflow.
Note that this command is present only for the clusterParam block of the rnaAssemble command. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minMatchPercent |
The minimum percentage of matches in an overlap required to join two sequences in the same contig. |
[number]
Default = 93 for most workflows, 60 for the transcript annotation workflow |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minMultiMergeClusterSize |
When two or more clusters overlap the same k-mer, the minimum number of reads (depth) required at that k-mer for a cluster to be considered significant.
If three or more clusters exceed this threshold, the k-mer is considered “noisy” and a potential false join, and will not be merged. This is reported as a “multi-cluster link that was not merged”.
If two significant clusters overlap and have similar enough depth, the clusters are considered linked and are scaffolded together. Otherwise, if only one cluster is significant, all reads at that k-mer which have no assigned cluster are merged directly into it as described for the minSingleMergeClusterSize option. This parameter is used in the “clustering” step of the transcript annotation workflow.
Note that this command is present only for the clusterParam block of the rnaAssemble command. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
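The decision rules in the minMultiMergeClusterSize row can be sketched as a small Python function. This is an illustration only: the function and its return strings are hypothetical, "significant" is assumed to mean a depth at or above the threshold, and the depth-similarity test is passed in as a callable because the exact criterion is not spelled out in the description.

```python
def classify_kmer(cluster_depths, min_multi_merge_cluster_size, depths_similar):
    """Classify a k-mer shared by several clusters, following the prose rules.

    cluster_depths: read depth of each overlapping cluster at this k-mer.
    depths_similar: hypothetical callable deciding whether two depths are
        "similar enough"; the real criterion is an assumption here.
    """
    # Assumption: a cluster is significant when its depth meets the threshold.
    significant = [d for d in cluster_depths if d >= min_multi_merge_cluster_size]
    if len(significant) >= 3:
        return "noisy: not merged"
    if len(significant) == 2:
        if depths_similar(significant[0], significant[1]):
            return "linked: scaffolded together"
        # The description does not say what happens in this case.
        return "unspecified: two significant clusters with dissimilar depth"
    if len(significant) == 1:
        return "merged: unassigned reads join the significant cluster"
    return "no significant cluster"


def within_factor_two(a, b):
    """Hypothetical similarity rule: depths within a factor of two of each other."""
    return max(a, b) <= 2 * min(a, b)


print(classify_kmer([12, 9, 8], 7, within_factor_two))
print(classify_kmer([12, 9, 3], 7, within_factor_two))
print(classify_kmer([12, 2], 7, within_factor_two))
```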
|
minMultiMergeIgnoreFactor |
When two or more clusters overlap the same k-mer and may be linked, they must be within this ratio of one another. Used in the “clustering” step of the transcript annotation workflow.
Note that this command is present only for the clusterParam block of the rnaAssemble command. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minSeqsPerTemplate |
Minimum number of sequences sufficient to build the layout or alignment. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minSingleMergeClusterSize |
The minimum number of reads (depth) matching an existing cluster at a single k-mer required to extend that cluster by immediately adding all new reads for that k-mer to the cluster. Used in the “clustering” step of the transcript annotation workflow.
Note that this command is present only for the clusterParam block of the rnaAssemble command. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
minNewClusterSize |
Minimum number of matching reads at a single k-mer (i.e., “depth”) required to create a new cluster. Used in the “clustering” step of the transcript annotation workflow.
Note that this command is present only for the clusterParam block of the rnaAssemble command. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
mismatchPenalty |
The penalty for a base mismatch during an alignment. This penalty is deducted from the pairwise score used to calculate match percentage. |
[number]
Default = 20 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
noSexChromosomes |
Disables special handling of sex chromosomes. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
noSVPairSort |
Specifies whether to turn off the calculation of pairs for structural variations. This may potentially reduce XNG assembly time. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
onePackage |
Specifies whether an assembly containing multiple reference sequences should be bundled into a single .assembly package. If ‘false’ is entered, one .assembly package is created per contig. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
openInSeqman |
(optional) Specifies whether the completed assembly should immediately be launched in SeqMan. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
(required) Specifies the path and directory of the output files. Incomplete paths are appended to the default directory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
pairDist |
(Intended for internal use only) |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
pickTemplate |
Defines the number of templates from which to choose, and finds the template that is the best match for the input sequence. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
placeHit |
(Intended for internal use only) |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
probe |
(Intended for internal use only) |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
query |
(required) Specifies the directory and file name(s) of the query data to be assembled. A folder with one or more data files can also be used in place of individual file names.
Properties for query:
file: [directory/filename enclosed in quotes] Specifies the directory and file/folder.
isPair: [true|false] Specifies whether the query files contain paired end data.
minDist: [number] (required if isPair is ‘true’) Specifies the minimum expected distance in bases between paired end reads. Default is 0.
maxDist: [number] (required if isPair is ‘true’) Specifies the maximum expected distance in bases between paired end reads. Defaults are 750 for Illumina, 4500 for 454 and Sanger, 7500 for Other, and user-defined for Ion Torrent.
seqTech: [unknown|IonTorrent|Illumina|IlluminaLongReads|454|PacBio|normalScore|Other]
Specifies the offset to be used when converting compressed quality scores into numerical values. These are the offsets used for the technology specified:
Note 1: For 454, quality scores for homopolymeric runs of ≥ 2 are oriented from 5’ to 3’ on the top strand.
Note 2: If possible, the data type of unknown data is determined automatically based on the first data file.
pairTech : [unknown|LucigenRsaI|LucigenBfaI|Rsa1|Bfa1|Custom]
pairLinker: [string]
groupName: [string] The name of a group this file belongs to. Used for running multiple samples in one file.
sex: [unknown|female|male]
trim: [true|false] Specifies whether vector trimming needs to be applied to the reads.
sngTrim: Contains parameters for fast vector trimming (see the SNG TrimVector command).
scan: [true|false] Specifies whether reads need to be scanned for contaminants.
contaminantScan: Contains the assembleTemplate command with the contaminant file used as a template and the parameters directoryTemplateMer, hits, layout, output, unassembled, results, format, mersize, ignorePolyMers and deleteIntermediates. The format parameter has the value none_ALIGN.
Example:
query: {{file: “/data/home/proj/Illumina_s_5_1.txt”} {file: “/data/home/proj/Illumina_s_5_2.txt”} isPair: true minDist: 400 maxDist: 700 seqTech: Illumina} |
[directory/filename enclosed in quotes]
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
recordSplitsOnly |
Functional only when used in the same program as splitTemplateContigs or recordStructVariations (both described below). Specifies whether or not to turn off contig splitting while still recording SVs for later inclusion in the Structural Variation Report. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
recordStructVariations |
Specifies under which circumstances structural variations (SVs) should be calculated and recorded.
0|false = Don’t calculate SVs
1|true = Calculate SVs at zero coverage
2 = Calculate SVs at insertions and deletions
3 = Calculate SVs at zero coverage and at insertions |
[integer between 0-3|true|false]
Default = 2 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
removeDuplicateSeqs |
Completely removes clonal reads after the alignment phase of assembly. Clonal reads, where the endpoints of both reads in a pair match those in another pair, are usually the result of PCR artifacts. If ‘true,’ the reads will not be scored, and will not be included in SNP calculations. Setting this parameter to ‘true’ may substantially increase the time needed for assembly. |
[true|false]
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
removeUniqueInserts |
Removes reads that cause an insert which no other read would create. This parameter is only enabled when delayAlignInserts (described under the assembleTemplate command) is true. |
[true|false]
Defaults: true for Illumina and Ion Torrent read technologies; false for all other types. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
repeatPenaltyScale |
Indicates the quality penalty (using the Phred scale) to use for a read which places in two locations identically. Higher repeat counts are further penalized relative to this on a log2 scale such that repeats placing in four locations have a double penalty, in eight locations have a triple penalty, and so on. This penalty is applied to a ceiling of Phred score 30 if the other methods are disabled or have a higher score. |
[number]
Default = 8 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
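The log2 scaling described for repeatPenaltyScale can be illustrated with a short Python sketch. It assumes the penalty is the scale multiplied by log2 of the number of placement locations, which reproduces the stated single, double, and triple penalties at two, four, and eight locations; the behaviour between powers of two and the interaction with the Phred 30 ceiling are not modelled, and the function name is hypothetical.

```python
import math


def repeat_penalty(locations: int, repeat_penalty_scale: float = 8) -> float:
    """Phred-scale penalty for a read that places identically in several locations."""
    if locations < 1:
        raise ValueError("a read must place in at least one location")
    # Assumption: penalty grows with log2 of the location count,
    # so 2 locations -> 1x scale, 4 -> 2x, 8 -> 3x.
    return repeat_penalty_scale * math.log2(locations)


for n in (2, 4, 8):
    print(n, repeat_penalty(n))
```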
|
repeatThreshMax |
Specifies the maximum number of occurrences of a mer in the reference sequence(s) for it to be considered repeated. Mers exceeding this number will not be used for identifying matches. |
[number from 1-10000]
Default = 100 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
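To illustrate the effect of repeatThreshMax, the following Python sketch counts every mer in a single reference string and keeps only those that do not exceed the threshold. It is a simplification under stated assumptions: the function name is hypothetical, reverse complements and multiple reference sequences are ignored, and XNG's own mer file format is not represented.

```python
from collections import Counter


def usable_mers(reference: str, mer_size: int, repeat_thresh_max: int) -> set[str]:
    """Return mers whose occurrence count in the reference is <= repeat_thresh_max."""
    counts = Counter(
        reference[i:i + mer_size]
        for i in range(len(reference) - mer_size + 1)
    )
    # Mers exceeding the threshold are treated as repeats and dropped.
    return {mer for mer, n in counts.items() if n <= repeat_thresh_max}


# "AA" occurs four times, so it is excluded at a threshold of 3.
print(sorted(usable_mers("AAAAACGT", mer_size=2, repeat_thresh_max=3)))
```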
|
repeatThreshMin |
Specifies the minimum number of occurrences of a mer in the reference sequence(s) for it to be considered repeated. Mers occurring fewer times than this number will not be used for identifying matches. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
reportFiles |
Defines the kind of report file to be generated.
perProject: [true|false] Generate a per project report.
perTemplate: [true|false] Generate a per template report.
removeInteral: [true|false] Remove intermediate reports. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
repeatmermax |
Threshold number of occurrences in a data set for a mer to be considered “repeated.” Used in the “clustering” step of the transcript annotation workflow. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
results |
Specifies the path and name of the result summary file. This file contains a compilation of assembly statistics and uses the extension fileSize.txt. Incomplete paths will be appended to the default directory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
saveUnSplitAssembly |
Specifies whether XNG should save both the normal assembly output, [filename].assembly, and the unsplit intermediate assembly, [filename]-noSplit.assembly. The latter file contains SVs but no SNPs, and can be used to validate splits in the final assembly. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
sex |
Specifies the sex of the subject, used for read placement and SNP calling. See Handling of Sex Chromosomes for details. |
[male|female|unknown] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
showCDSVariant |
Specifies whether or not XNG should show all variants of a CDS feature contacted by a SNP. The version number for the CDS variant will then appear in brackets when viewed in the SNP report in SeqMan Pro. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
sngConvertOptions |
(Intended for internal use only) |
[text string] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp |
Specifies whether or not a SNP detection pass of the gapped alignment should be made during the assembly. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_checkStrandedness |
Specifies whether or not the strand that each read comes from is considered in the SNP calculation. This is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_combineSubs |
This parameter is used to coalesce adjacent substitutions. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_excludeBases3p |
(internal use only) This parameter causes the specified number of bases from the 3' end of each read to not be considered during variant calling. |
[integer] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_excludeBases5p |
(internal use only) This parameter causes the specified number of bases from the 5' end of each read to not be considered during variant calling. |
[integer] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_excludeBasesEdge |
This parameter causes the specified number of bases from both the 5' and 3' ends of each read to not be considered during variant calling. |
[integer]
For the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 5. For the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 0. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_limitEndPos |
Specifies the 3' most coordinate of the specified template from which to stop calculating SNPs. |
[number between 1 and the length of the template] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_limitStartPos |
Specifies the 5' most coordinate of the specified template from which to begin calculating SNPs. A value between 1 and the length of the template must be entered. |
[number]
Default = 1 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_limitTemplateID |
Specifies a single template ID for which to calculate SNPs. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_logEndPos |
Specifies the 3' most coordinate of the specified template from which to stop storing a detailed log of SNP information. A value between 1 and the length of the template must be entered. |
[number]
Default = 1 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_logLevel |
Specifies the level of detailed logging to store in the “shared” project directory as “SNP.log.” Level 0 specifies that no log will be stored. Level 1 stores detailed info on the SNPs which were called, level 2 also logs columns where the preliminary filter passed but the final filtering failed, and level 3 logs all columns. This is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). |
[whole number from 0-3]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_logStartPos |
Specifies the 5' most coordinate of the specified template from which to begin storing a detailed log of SNP information. A value between 1 and the length of the template must be entered. |
[number]
Default = 1 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_logTemplateID |
Specifies a single template from which to store a detailed log of SNP information. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_maxRun |
Specifies the maximum length of a homopolymeric run for an indel to be considered during variant calling. For example, a snp_maxRun of '5' will allow a portion of sequence up to 5 bases in length to be called as a SNP. |
[integer]
Defaults are 3 for 454 and Ion Torrent read technologies; 5 for all others. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_maxStrandBias
|
Strand Bias (SB) for a SNP is the bias for the SNP appearing on one strand versus the other. It is measured relative to the strand bias in the assembly at the location of the SNP. For example, in a column with 60 forward reads and 40 backward reads, 6 SNP bases on the forward strands, and 4 on the reverse strands would be unbiased. SB is given by the formula:
SB = |SNP% f – SNP% r | / Total SNP%
…where SNP% f and SNP% r are the percentage of reads containing the variant on the forward (top) and reverse (bottom) strands, respectively, and Total SNP% is the total percentage of reads containing the variant. Because SB is calculated from an absolute value, it is never negative.
The following table describes different SB thresholds:
Note: In cases where all the reads covering a base are on one strand only, the SNP% of the other strand cannot be calculated (due to a “division by zero” error). These positions will not be removed by the snp_maxStrandBias filter. To remove these variants, instead set snp_minStrandCov to ≥ 1.
Example:
In a homozygous case (SNP% = 100) with a depth of 100, where 75 variant-containing reads are on the top strand (75%) and 25 variant-containing reads are on the bottom strand (25%), the strand bias would equal: (75–25)/100 = 0.5. |
[number]
Defaults for the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) are 0.8 for 454 and Ion Torrent read technologies; not shown (blank) for all others. Defaults for the simple SNP calling method (used when genome ploidy is “Heterogeneous”) are 0.25 for all read technologies. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
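The SB formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the assembler's implementation; the function name `strand_bias` and its arguments are hypothetical and simply mirror the terms SNP% f, SNP% r and Total SNP% from the description.

```python
def strand_bias(snp_pct_fwd, snp_pct_rev, total_snp_pct):
    """Return SB = |SNP% f - SNP% r| / Total SNP% (hypothetical helper)."""
    if total_snp_pct == 0:
        # No variant reads at all: SB is undefined for this column.
        raise ValueError("Total SNP% is zero; strand bias is undefined")
    return abs(snp_pct_fwd - snp_pct_rev) / total_snp_pct


# Worked example from the description: 75% top strand, 25% bottom strand,
# Total SNP% = 100.
sb_example = strand_bias(75, 25, 100)

# Unbiased column from the description: 6 of 60 forward reads (10%) and
# 4 of 40 reverse reads (10%) carry the SNP, so Total SNP% is also 10.
sb_unbiased = strand_bias(10, 10, 10)

print(sb_example, sb_unbiased)
```

A column would then be compared against the snp_maxStrandBias threshold; whether that comparison is inclusive is not stated in this table.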
|
snp_minHomopolDelDepth |
Specifies the minimum read depth required to call a deletion in a homopolymeric run. |
[integer]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minHomopolDelFrac |
Specifies the minimum fraction of reads required to call a deletion in a homopolymeric run. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minHomopolInsDepth |
Specifies the minimum read depth required to call an insertion in a homopolymeric run. |
[integer]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minHomopolInsFrac |
Specifies the minimum fraction of reads required to call an insertion in a homopolymeric run. |
[number]
Default = 0 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minPctToScore |
Specifies the minimum percentage of reads in a column that must differ from the reference in order to score the column. For the simple SNP calling method (used when genome ploidy is “Heterogeneous”), this is the only criterion used to call a SNP. For the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), this is a filter applied before the other parameters. |
[number from 0-1]
Default = 0.05 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minProbNonrefToCall |
Specifies the minimum probability of a SNP column that is required to call a SNP, expressed as a number between 0 and 1. The probabilities of all genotypes other than Homozygous Reference are totaled and checked against this number. This is the final filter applied during the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) and is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). |
[number from 0-1]
Default = 0.1, requiring a minimum 10% change. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
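The check described for snp_minProbNonrefToCall (total the probabilities of every genotype except Homozygous Reference, then compare against the threshold) can be sketched as follows. The genotype labels, the probabilities and the helper name `prob_nonref` are made up for illustration, and the inclusive comparison is an assumption.

```python
def prob_nonref(genotype_probs, hom_ref_genotype):
    """Sum the probabilities of all genotypes other than homozygous reference."""
    return sum(p for g, p in genotype_probs.items() if g != hom_ref_genotype)


# Hypothetical genotype probabilities for one column with reference base A.
column_probs = {"AA": 0.85, "AG": 0.12, "GG": 0.03}

min_prob_nonref_to_call = 0.1  # default listed for snp_minProbNonrefToCall

p_nonref = prob_nonref(column_probs, "AA")
# Assumption: a column passes when the total meets or exceeds the threshold.
is_called = p_nonref >= min_prob_nonref_to_call

print(round(p_nonref, 2), is_called)
```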
|
snp_minStrandCov |
Specifies the minimum number of reads from each strand required to call a variant at a given position. |
[integer]
In the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 0. In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 5. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minVariantDepthToScore |
(required if “snp” is true) Specifies the minimum depth required for a specific base (or deletion) in a column before it is considered usable for SNP calling. This is the second filter applied during the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”) and is ignored by the simple SNP calling method (used when genome ploidy is “Heterogeneous”). |
[number from 0-100]
Default = 2 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minWeight |
Called “Minimum base quality score” in the SeqMan NGen wizard, this parameter specifies the minimum quality score for a base to be considered in the SNP calculation. |
[number]
In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 20. In the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 5. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_reportUserMissing |
Specifies what kind of positions to put in the missingUser file, including one or more of the following:
dbSNP = dbSNP Pos
user = in user VCF SNP file
zeroCoverage = include zero coverage regions
cosmic = in COSMIC database
allcaptured = include all positions in capture regions
captured = include only positions in capture regions
Example:
snp_reportUserMissing: [user allcaptured captured] |
[dbSNP|user|zeroCoverage|cosmic|allcaptured|captured] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_runVar |
Uses a Bayesian probabilistic model to exclude heterozygous insertions and deletions in homopolymeric runs. Intended for use with Ion Torrent data. |
[true|false]
Defaults: true for 454 and Ion Torrent read technologies; false for all others. |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_showAllFeatures |
Specifies whether XNG should count SNPs multiple times if the SNP contacts different versions (variants) of a CDS feature. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_writeExtended |
Specifies whether the additional values produced by the Haploid or Diploid SNP calculation methods are included in the SNP table. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snpMethod |
Specifies the SNP detection method to use. Simple produces a count of each type of base in the column and calculates the percent of non-reference bases. Haploid uses a Bayesian statistical model to calculate a probability score that the position contains a polymorphism and give a quality score for the base called at that position. Diploid uses a Bayesian statistical model to calculate a probability score that the position contains a polymorphism and give a quality score for the base(s) called at that position. Based on the scores, it also calls the genotype at each position. |
[simple|haploid|diploid] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
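As a rough illustration of the Simple method described above (count each base in the column, then compute the share of non-reference bases), consider the sketch below. It is not the assembler's code; the function name and the sample column are hypothetical, and the comparison against snp_minPctToScore is assumed to be inclusive.

```python
from collections import Counter


def simple_column_score(column_bases, ref_base):
    """Count each base in a column and return (counts, non-reference fraction)."""
    counts = Counter(column_bases)
    depth = sum(counts.values())
    non_ref = depth - counts[ref_base]
    fraction = non_ref / depth if depth else 0.0
    return counts, fraction


# Hypothetical column: ten reads, eight match the reference base A.
base_counts, non_ref_fraction = simple_column_score("AAAAAAAAGG", "A")

min_pct_to_score = 0.05  # default listed for snp_minPctToScore
# Assumption: the column is scored when the fraction meets the threshold.
is_scored = non_ref_fraction >= min_pct_to_score

print(dict(base_counts), non_ref_fraction, is_scored)
```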
|
splitTemplateContigs |
Specifies under which circumstances contigs should be cut after a templated assembly. Any split contigs will be grouped into scaffolds with a defined position to allow for easy sorting when the project is viewed in SeqMan Pro. This command pertains only to reference-guided assemblies with gap closure. By default, during this type of assembly, the XNG assembler first finds structural variations (SVs) then splits the contig after each SV. Elements of this process can be modified using this command.
0|false = Don't split
1|true = Split at locations with zero coverage
2 = Split at insertions and deletions
3 = Split at zero coverage and at insertions |
[integer between 0-3|true|false]
Default = 2 |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
template |
(required) Specifies the directory and file name of the reference sequence file. A folder with one or more reference sequence files can also be used in place of individual file names. Each entry must be enclosed by brackets. If more than one template entry is used, the list must also be enclosed by an additional set of brackets.
Properties for template:
file: [directory/filename enclosed in quotes] Specifies the directory and file/folder.
feature: [directory/filename enclosed in quotes] (optional) Specifies the directory and file name for annotated features when the reference sequence and feature annotations are in separate files.
transcriptKind: [both|identified|novel] If the .Transcriptome package is used as a template, defines which transcripts will be used as a template.
userSNP: [directory/filename enclosed in quotes]
exomeCapture:
file: [directory/filename enclosed in quotes] The BED file name.
track: [string] (optional) The region of interest.
merMask: [true|false] Specifies if mers from outside of the capture region should be excluded from assembly.
Examples for template:
Sequence and annotation in one file:
AssembleTemplate template: {{file: “/data/home/proj/MG1655.gbk”} {file: “/data/home/proj/W3110.gbk”}}
Sequence and annotation in separate files:
AssembleTemplate template: {file: “/Library/ABC_proj/references/MG1655.fas” feature: “/Library/ABC_proj/references/MG1655.gff”} |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
templateHitCntThresh |
(Intended for internal use only) |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
trimToTargetRegions |
Controls whether reads are trimmed, by default, to the boundaries of the targeted regions, as defined by the .bed or manifest file. The default of ‘true’ indicates that the reads are trimmed to the stated boundaries. If conditions are not met, the SeqMan NGen wizard does not change this parameter to 'false,' but instead omits it from the script. The parameter status is only shown in the script for control workflows. |
[true|false] |
Advanced Options, Alignment tab: Trim to targeted regions | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
unassembled |
|
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
verify |
|
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
computeSNP
Sets parameters for the SNP computation phase of the assembly. The command is designed for use with existing BAM files that have not been analyzed for SNPs, or to re-analyze an existing file with different parameters. Most of the parameters for computeSNP are identical to parameters for assembleTemplate, described above:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
calcJunctionSeqs |
In the structural variation workflow, specifying 'false' prevents junction sequences from being calculated. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
concurrentAligns |
(Intended for internal use only) |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
(required) Specifies the path and name of one or more .assembly projects from which to compute SNPs. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_writeMissingDBSnps |
In a SNP assembly, specifying 'false' causes missing SNPs not to be recorded, saving time and file space. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snpFilter |
Specifies whether SNP filtering is turned on or off.
Properties for snpFilter:
capture: [true|false] Specifies whether there is an exome capture file. If an exome capture file is added in the SeqMan NGen wizard or through a script, this value is set to ‘true.’ In the absence of an exome capture file, the SeqMan NGen wizard automatically sets this property to 'false.'
pNotRefMinVal: [number] In the unusual case that the hard filter is missing, this property is used to set the minimum value that can be displayed in the SeqMan SNP table. Otherwise, this property is ignored. Default is 10.
userOnly: [true|false|All] Specifies whether there is a VCF SNP file. The SeqMan NGen wizard always calls this as ‘true’ (or ‘yes’) but ignores the property if no VCF SNP file has been loaded.
pNotRef: [number] Called “SNP Filter Stringency” in the SeqMan NGen wizard, this specifies a PnotRef threshold. This is a “soft” filter. Data not matching the criterion are removed from the default display of the SeqMan Pro SNP table. This option is only available for the Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”). Wizard values include Low (90%), Medium (99%) and High (99.9%).
minSnpFilter: [number] This parameter does not relate to any setting in the SeqMan NGen wizard, but corresponds to “SNP%” in SeqMan Pro and “minSNPFilter” in ArrayStar. In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 5% for 454 and Ion Torrent read technologies; 1% for all others. In Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default depends on stringency and ploidy rather than the read technology. The default for Diploid is 15% for all stringency levels. The default for Haploid is 25% for low stringency, 50% for medium and 75% for high.
minDepth: [number] (optional) Specifies a minimum sequence depth threshold. This parameter does not relate to any setting in the SeqMan NGen wizard, but corresponds to “Depth” in SeqMan Pro and “minDepth” in ArrayStar. In the simple SNP calling method (used when genome ploidy is “Heterogeneous”), the default is 50. In Bayesian SNP calling methods (used when genome ploidy is “Diploid” or “Haploid”), the default is 20.
A set of SNP filters used by ArrayStar and SeqMan Pro:
codonOnly: [Coding|CodingChange|Nonsense|All]
maxDepth: [number]
maxCodingFeatureDistance: [number]
minSnpFilter: [number]
qCall: [number]
synonymousCodingChange: [true|false]
substitionCodingChange: [true|false]
noStartCodingChange: [true|false]
noStopCodingChange: [true|false]
nonsenseCodingChange: [true|false]
frameshiftCodingChange: [true|false]
notCodingCodingChange: [true|false]
inFrameIndelCodingChange: [true|false]
refOnly: [Reference|Unique|All]
cosmicOnly: [Yes|No|All]
minIndelSize: [number]
gerpScore: [number]
substitution: [true|false]
showIndels: [true|false] |
[true|false] |
(pertains to pNotRef only) Assembly Options: SNP Filter Stringency | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
userSNP |
Specifies a location for storing the VCF SNP table. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
createGenomeTemplate
(Intended for internal use only) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
Specifies the directory and file/folder of the input file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
The path and name of the output file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
diskPath
(required) Defines the default directory where temporary intermediate files from the assembly will be stored. These files can be large for large-scale projects. Visit our website to view space requirements for a range of representative projects. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
clean |
Specifies whether or not to clean the merge disk. When automated scripts are being run simultaneously or sequentially, this command can be useful for emptying the merge disk between assemblies. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
pathMac |
Specifies the default path and file name for Macintosh. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
pathWin |
Specifies the default path and file name for Windows. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
path |
(required) Specifies the default path and file name.
Example:
diskPath path: “/data/proj/” |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
dumpConsensus
(Intended for internal use only) Converts the binary consensus file created during assembly into a text file. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
Specifies the directory and file/folder. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
dumpSNP
(Intended for internal use only) Creates a tab-delimited text file from one or more SNP-containing binary files generated during assembly. SNP binary files include those with the .snpExt suffix contained in an .assembly package as well as those with either the .coverage.missingSNP or .nocoverage.missingSNP suffix contained in the _shared folder. To convert all the .snpExt files in a package, simply use the .assembly name. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
(required) Specifies the path and name of .assembly package (all SNP files will be included), one or more individual .snpExt files or either/both of the missingSNP files. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
(required) Specifies the path and name of the output file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
refPos_end |
Exports only SNPs with positions lower than this value. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
refPos_start |
Exports only SNPs with positions higher than this value. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_maxProbNonrefToCall |
Upper limit for probability scores for exported SNPs. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_minProbNonrefToCall |
Lower limit for probability scores for exported SNPs. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
snp_type |
Specifies which SNP file from the .assembly to use as an input. |
[simple|SNP|missing|user|stats|userIDOnly] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
templateID |
Defines the template for which SNPs will be exported. |
[number] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
onefile |
Defines whether all SNPs should be placed into one file. |
[true|false] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
exportSplits
(Intended for internal use only) Converts the binary splits file created during assembly into a text file. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
Specifies the directory and file/folder. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
The path and name of the output file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
execute
Executes any shell script command. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
command |
Text for any shell script command. |
[text string] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
exportVCF
Accepts the exome capture file and VCF file and builds another VCF file containing SNPs only in the capture regions. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
userSNP |
User SNP file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
exomeCapture |
file: [directory/filename enclosed in quotes] Exome capture file.
track: [text string] The name of the region of interest. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
The output VCF file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
extractPairs
Creates a tab-delimited table of paired-end information. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
The path and name of any pair distance file (.pairdist file) from within a project's shared folder. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
The path and name of the output file. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
include
When building a script, this command can be used to call up additional lines of script previously stored in a text file. In this way, a group of commands can be shared between two or more scripts. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
Specifies the directory and file/folder. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
loadAssembly
(Intended for internal use only) | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
file |
Specifies the directory and file/folder. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
loadBAM
Sets parameters for analyzing existing BAM files. It allows ungapped BAM files to be converted into a fully gapped assembly file or to re-gap an existing file with different parameters. The command also permits SNPs to be calculated or re-calculated with different parameters starting with an existing BAM file. The associated parameters are also available for full assemblies and are described under the assembleTemplate command, near the top of this table:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mergeIonTorrentShortReads
When using Ion Torrent data, use of this command merges overlapping short reads into mini-contigs. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
output |
(required) Specifies the path and directory of the output files. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
query |
(required) Specifies the directory and file name(s) of the query data to be assembled. A folder with one or more data files can also be used in place of individual file names. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
message
Writes out the string to the standard output. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
str |
Specifies the string to be written to the standard output. |
[text string] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
pairFilePattern
Allows you to specify the pattern for pair files using the GREP language.
Example:
pairFilePattern forward: “(?'name'.*)_R1_(?'ext'.*)\fastq” reverse: “(?'name'.*)_R2_(?'ext'.*)\fastq” | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
forward |
A naming pattern to match forward clones. |
[text string enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
reverse |
A naming pattern to match reverse clones. |
[text string enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
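To show how forward and reverse naming patterns with shared named groups can pair files, here is a small Python sketch. It is an analogy only: Python writes named groups as `(?P<name>...)` rather than `(?'name'...)`, the trailing `\.fastq` is an assumed reading of the example pattern, and the file names are invented.

```python
import re

# Python equivalents of the forward and reverse patterns (assumed reading).
FORWARD = re.compile(r"(?P<name>.*)_R1_(?P<ext>.*)\.fastq$")
REVERSE = re.compile(r"(?P<name>.*)_R2_(?P<ext>.*)\.fastq$")


def pair_files(filenames):
    """Pair forward and reverse files whose 'name' and 'ext' groups agree."""
    forward, reverse = {}, {}
    for filename in filenames:
        match = FORWARD.match(filename)
        if match:
            forward[(match["name"], match["ext"])] = filename
            continue
        match = REVERSE.match(filename)
        if match:
            reverse[(match["name"], match["ext"])] = filename
    shared_keys = forward.keys() & reverse.keys()
    return sorted((forward[key], reverse[key]) for key in shared_keys)


# Hypothetical file names; sampleB has no reverse mate and stays unpaired.
example_files = [
    "sampleA_R1_001.fastq",
    "sampleA_R2_001.fastq",
    "sampleB_R1_001.fastq",
]
paired = pair_files(example_files)
print(paired)
```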
pause
Creates a pause and can be used when running table scripts to stop at any point.
Example:
pause prompt: “Table script paused. Press enter to continue.” | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
prompt |
Text to appear in the console. The pause is terminated by hitting the Enter key. |
[text string enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
quit
Terminates a script. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
RemoveDuplicateSeqs
Coalesces multiple identical reads at the same position into a single read, provided the reads match the template exactly. If this feature is active, at the end of assembly, XNG will print the message: “Coalesced $lld identical reads that matched the template exactly.” Allowable values are [true|false]; default is false. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
runScript
Allows batching of multiple projects of the same type (e.g. assembly, computeSNP). Three files are required: 1) a runScript file with variables, 2) a file with a table of values for the variables, and 3) a script file specifying the action to be carried out.
Example (runScript file):
setDefaultDirectory directory: “.”
set $force: false
set $DataDisk: “/Volumes/Raid/DataDisk”
set $ResultDisk: “/Volumes/ResultDisk”
set $MergeDisk: “/Volumes/MergeDisk0”
set $snp:true
set $snpMethod:”Diploid”
set $repCnt:100
set $merLayoutMin:19
diskPath path: {“${MergeDisk}/mergeSort Data”}}
runScript table: “testAssembly.txt” script: “testAssembly.template.script”
Example (table file):
defaultDir template query isPair seqTech project merSize snp snpMethod
“${ResultDisk}/rice” ${DataDisk}/rice.genome ${DataDisk}/rice FALSE Illumina rice 21 TRUE Diploid
“${ResultDisk}/ecoli” ${DataDisk}/Ecoli.gbk ${DataDisk}/ecoli TRUE Illumina Ecoli 21 TRUE Diploid
“${ResultDisk}/Exome” ${DataDisk}/GRCh37.gbk ${DataDisk}/Sample1 FALSE 454 HuEx 19 TRUE Diploid
Example (script file):
; “assembly.template.script”
setMachineMemory memory:32
setDefaultDirectory directory: $defaultDir
compareSeqs template: $template
query: {file: $query isPair: $isPair seqTech: $seqTech}
directoryMer: “intermediateFiles”
; directoryQueryMer: “intermediateFiles”
hits: “intermediateFiles/${project}.hits”
layout: “intermediateFiles/${project}.layout”
output: “results_${mersize}_${merSkipQuery}/${project}”
; results per project
results: “${project}.results.txt”
; aggregate all results
results: “${ResultDisk}/assembly.results.txt”
merSize: $mersize
merSkipQuery: $merSkipQuery
repeatCnt: $repCnt
merLayoutMin: $merLayoutMin
layoutType: once
maxGap: 6
format: BAM
onePackage: true
snp: $snp
snpMethod: $snpMethod
; snp_writeExtended: true
forceMake: $force | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
script |
The filename and location of the script. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
table |
The filename and location of the file containing the text string and numeric values for each variable. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
inline |
Executes the list of commands and parameters. |
|
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
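The relationship between the three runScript files can be pictured as template expansion: each table row supplies values for the variables named in the header, and the script is filled in once per row. The Python sketch below only mimics that idea with `string.Template`, whose `$name` and `${name}` placeholders happen to resemble the variables in the examples. It is not how XNG itself processes scripts, and the shortened template and rows are hypothetical.

```python
from string import Template

# Shortened, hypothetical stand-in for the script file.
script_template = Template(
    "setDefaultDirectory directory: $defaultDir\n"
    'output: "results/${project}"\n'
    "snpMethod: $snpMethod\n"
)

# Hypothetical table file: a header row naming the variables, then one row
# of values per project.
header = ["defaultDir", "project", "snpMethod"]
rows = [
    ["/results/rice", "rice", "Diploid"],
    ["/results/ecoli", "Ecoli", "Diploid"],
]


def expand_scripts(template, header, rows):
    """Fill the template once per table row and return the resulting scripts."""
    return [template.substitute(dict(zip(header, row))) for row in rows]


expanded = expand_scripts(script_template, header, rows)
for script_text in expanded:
    print(script_text)
```

`Template.substitute` raises a `KeyError` when a variable used in the template has no column in the table, which makes a missing column easy to spot in this sketch.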
set
Used to set variables. See the example below and those under the runScript command.
Example:
set $snp:true set $snpMethod:”Diploid” | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
setDefaultDirectory
(required) Defines the default directory for the project. When a default directory is specified, files located in that directory only need to be identified by their subfolder and/or file name in subsequent commands.
Example:
setDefaultDirectory directory: “/data/home/proj/” | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directory or defaultDirectory |
(required) Specifies the default directory. Previously called defaultDirectory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directoryMac or defaultMacDirectory |
Specifies the default directory for Macintosh. Previously called defaultMacDirectory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
directoryWin or defaultWinDirectory |
Specifies the default directory for Windows. Previously called defaultWinDirectory. |
[directory/filename enclosed in quotes] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
setMachineMemory
Defines the amount of random access memory (RAM) that the program will use. Limiting the amount of RAM available to the assembler allows you to use the computer for other purposes while an assembly is running. However, this will likely slow down the assemblies and is not recommended for large projects.
Example:
setMachineMemory memory: 32 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
memory |
(required) Amount of RAM (in GB) to be used, entered in multiples of four. Entering a value greater than the available RAM causes all RAM to be used. |
[number that is a multiple of 4] |
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
setParam
Adjusts the stringency of one or more of the assembly parameters for the project. SeqMan NGen will use the default values for any parameter that is not specified within the script.
All of the parameters for setParam are identical to parameters for assembleTemplate, described near the top of this table:
delayAlignInserts gapPenalty increaseRunGapPen matchScore minAlignedLength minMatchPercent mismatchPenalty removeUniqueInserts |