The .assembly package is part of the output for XNG workflows. (The contents of the -noSplit.assembly package are similar to those of the .assembly package.)

In the file names below, the project name should be understood to precede any hyphen (-) or period (.) used at the beginning of file and folder names.

It is intended that the entire .assembly package be opened in SeqMan Pro or SeqMan Ultra for viewing and analysis of the assembly. However, the following individual files also contain useful information.

File Suffix Description
.vcf A VCF file (.vcf) is automatically created for all assemblies with variants. The file is modified in three ways, all of which adhere to the Variant Call Format (VCF) v4.2 specification:

* In the FILTER field, each row is marked with one of three qualifiers to show whether or not a position was covered:
** “PASS” for positions where a call could be determined based on the sequence read data.
** “NC” for positions with no sequence read coverage (this is denoted at the top of the file under ##FILTER).
** “.” for positions where data for a call is missing or a call could not be made.

These changes to the FILTER field apply to both single-sample and multi-sample VCFs, but not to VCFs lacking any sample information.

* In the QUAL field, a Phred-scaled quality score is provided for the assertion made in the ALT column. The score is calculated as -10 log10 prob(call in ALT is wrong).
** In rows where the ALT column contains ‘.’ (i.e., no variant was called), the QUAL column contains -10 log10 prob(variant).
** In rows where the ALT column does not contain ‘.’ (i.e., a variant was called), the QUAL column contains -10 log10 prob(no variant).
** A missing value is specified as “.”

* The PA field contains the Pnotref value. Note that the QUAL scale is reversed relative to Pnotref when ALT is "."; that is, when the call at a position is the reference. In either direction, however, QUAL scales logarithmically with Pnotref. As a result, QUAL is close to Qcall (or "GQ") when there is no “homozygous vs. heterozygous” call ambiguity, and diverges from it when that ambiguity is present.
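The FILTER and QUAL conventions described above can be illustrated with a short Python sketch. It is not part of the DNASTAR software. It assumes a tab-delimited VCF with the standard column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), and the records shown are invented purely for illustration. The function names `phred_to_prob` and `summarize_vcf` are hypothetical, and the PA field is not parsed here.

```python
from collections import Counter


def phred_to_prob(qual):
    """Invert the Phred scale: Q = -10 log10(p)  =>  p = 10^(-Q/10)."""
    return 10 ** (-qual / 10)


def summarize_vcf(lines):
    """Tally FILTER values and interpret QUAL for each data row.

    Returns (filter_counts, records), where each record is
    (chrom, pos, alt, meaning, probability).
    """
    filter_counts = Counter()
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        chrom, pos, _id, _ref, alt, qual, flt = line.split("\t")[:7]
        filter_counts[flt] += 1
        if qual == ".":
            # Missing QUAL value: nothing to convert.
            meaning, prob = None, None
        else:
            prob = phred_to_prob(float(qual))
            # Per the description above, QUAL is the probability that the
            # assertion in ALT is wrong: prob(variant) when ALT is ".",
            # prob(no variant) otherwise.
            meaning = "prob(variant)" if alt == "." else "prob(no variant)"
        records.append((chrom, int(pos), alt, meaning, prob))
    return filter_counts, records


# Invented example records (not real output from any assembly).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t30\tPASS\t.",
    "chr1\t101\t.\tC\t.\t20\tPASS\t.",
    "chr1\t102\t.\tT\t.\t.\tNC\t.",
]
filter_counts, records = summarize_vcf(example)
for record in records:
    print(record)
print(dict(filter_counts))
```

In this sketch, a QUAL of 30 on a variant row corresponds to a probability of about 0.001 that there is no variant, while a QUAL of 20 on a reference row corresponds to a probability of about 0.01 that there is a variant.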
.bed, .txt, etc. The target region file (.bed or manifest) for the assembly, if one was specified.
.templateInfo Contains general information for each contig in the assembly.
.enrichment_Summary.txt Contains the textual information for the Project > Show coverage of target regions option in SeqMan Pro and SeqMan Ultra.
.sqd This file is created only when the .assembly package is first opened in SeqMan Pro or SeqMan Ultra. It contains saved display-specific information, such as SNP filtering criteria. Double-clicking this file opens the .assembly package in SeqMan Pro or SeqMan Ultra.
-Transcriptome table folder containing the file .table.txt This folder and its file, showing the putative gene identity for each transcript, are created for the de novo transcriptome RNA-seq workflow only.
There is normally no reason to open the following files.
-0.assemblyInfo Contains information about assembly parameters which can be used for combining multiple assemblies. This file is not present for SeqMan NGen assemblies made prior to version 14.0. In 14.0 and later, it is present in templated miRNA, ChIP-Seq, and RNA-Seq workflows.
[project name]Transcriptome.table.txt This file is present in RNA-seq workflows that used a .Transcriptome package as a template. It is equivalent to the .table.txt file in the Transcriptome table folder of the .Transcriptome package.
.auxPair (internal use only)
.bam The BAM formatted alignment file.
.bam.bai The BAM index file.
.capture.userSNP.vcf (internal use only)
.combined.snpExt (internal use only)
.coverage Contains information at each position along the contig where the coverage changes.
.coverage2 Contains information for the maximum coverage of 100 base pair intervals across the contig.
.coverage4 Contains information for the maximum coverage of 10,000 base pair intervals across the contig.
.coverage.missingSNP Contains information about positions in dbSNP that had coverage and were called the reference base in the assembly.
.exomeCapture-features (internal use only)
.info Contains files used by SeqMan Pro and SeqMan Ultra.
.midinfo (internal use only)
missing.fas A fasta file of reads with no mers matching the reference.
missing.fas.qual A base quality file of reads with no mers matching the reference.
.nocoverage.missingSNP Contains information about positions in dbSNP that had no coverage in the assembly.
outofOrder.txt A text file of sequence reads not included in the final assembly due to excessive trimming during the alignment phase.
.pair (internal use only)
.pairDist Contains information about the position and distance between paired end reads.
pairSpecifiers.txt (internal use only)
poor.fas A fasta file of reads rejected at the layout phase due to match scores below the threshold.
poor.fas.qual A base quality file of reads rejected at the layout phase due to match scores below the threshold.
.quant Reprises information in the .coverage4, .coverage2, and/or .coverage files.
.region_capture.bed (internal use only)
report.txt Contains the textual information for the Project > Report option in SeqMan Pro. See View the Project Report for information about the report contents for XNG and SNG workflows.
.snp Contains all the information for SNPs called using the “Simple” method.
.snpExt Contains all the information for SNPs called using either the “Diploid” or “Haploid” method.
SNPs.log An optional text form of the .snpExt table that contains information on how each SNP was calculated. If you encounter a problem, DNASTAR Support can use this file to help you with troubleshooting.
.splitExt (internal use only)
.template-comment Contains the comment information for that contig.
.template-features Contains the feature information for that contig.
.template-features2 (internal use only)
.template.fof A file-of-files containing the path and file names of the reference sequences.
.template-gapped-seq A .seq file of the template containing gaps.
.template-gaps A binary file of the template gap information.
.template-seq A .seq file of the template without gaps.
unaligned.fas.qual A base quality file of reads rejected at the alignment phase.
