Saving Variant Report Information to a VCF Variant File

You can save selected variants from your projects in a VCF Variant Table (.vcf), with one such file allowed per template package. In subsequent assemblies, any template package containing a VCF Variant Table will report cases in which the same position is called as a variant, analogous to dbSNP.

 

To create or append to a VCF Variant Table from within SeqMan Pro:

 

1)  Open the Variants Summary report via the Variant > Variant Report command.

 

2)  Confirm one or more variants in the report using Variant > Confirm Variant. Multiple rows may be selected using Shift+click or Ctrl/Cmd+click. Confirmed variants are denoted with a checkmark (✓) in the SNP column.

 

3)  Choose Variant > Append Checked Variants to VCF.

 

The VCF Reference Variants From table opens automatically with the new or appended variant(s) highlighted.

 

 

The table has the following columns:

 

Column Name: Description

Contig ID: The name of the contig or chromosome containing the variant.

Ref Pos: The position of the variant in reference coordinates.

dbSNP: The rsID, if the position corresponds to a dbSNP entry.

User ID: An automatically generated number identifying the entry in the VCF Variant Table.

Source Assembly: The name of the last assembly from which the position was added to the table. If an existing position in the table is added again from a new assembly, the source assembly is updated, but the other information is not.

Filter: Each row is marked with one of three qualifiers to show whether or not a position was covered:

      “PASS” for positions where a call could be determined based on the sequence read data.

      “NC” for positions with no sequence read coverage (this qualifier is defined at the top of the file under ##FILTER).

      “.” for positions where data for a call is missing or a call could not be made.

      These FILTER conventions apply to both single-sample and multi-sample VCFs, but not to VCFs lacking any sample information.

Qual: A Phred-scaled quality score for the assertion made in the ALT column. The score is calculated as -10 log10 prob(call in ALT is wrong).

      In rows where the ALT column contains “.” (i.e., no variant was called), the column contains -10 log10 prob(variant).

      In rows where the ALT column does not contain “.” (i.e., a variant was called), the column contains -10 log10 prob(no variant).

      A missing value is specified as “.”.

PA: The Pnotref value. Note that the QUAL scale is reversed relative to Pnotref when ALT is “.”, that is, when the position matches the reference. In either direction, QUAL scales logarithmically with Pnotref. As a result, QUAL is closer to Qcall (or “GQ”) when there is no “homozygous vs. heterozygous” call ambiguity, and diverges from it when that ambiguity is present.
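
The relationship between Qual and Pnotref described above can be illustrated with a short Python sketch. This is not SeqMan Pro's own code: the function name qual_from_pnotref is hypothetical, and the sketch assumes that Pnotref is available as a probability between 0 and 1 (the scale actually stored in the PA column may differ).

```python
import math
from typing import Optional


def qual_from_pnotref(p_notref: Optional[float], alt: str) -> Optional[float]:
    """Phred-scaled QUAL for one row (illustrative sketch only).

    p_notref -- assumed probability (0..1) that the position differs
                from the reference; None means the value is missing.
    alt      -- contents of the ALT column ("." means no variant called).
    Returns None for a missing value, which a VCF writes as ".".
    """
    if p_notref is None:
        return None
    if not 0.0 <= p_notref <= 1.0:
        raise ValueError("p_notref must be a probability between 0 and 1")

    if alt == ".":
        # Assertion is "no variant"; it is wrong with probability prob(variant).
        p_wrong = p_notref
    else:
        # Assertion is "variant"; it is wrong with probability prob(no variant).
        p_wrong = 1.0 - p_notref

    if p_wrong == 0.0:
        return math.inf  # the assertion is certain under this model
    return -10.0 * math.log10(p_wrong)
```

Under these assumptions, a reference row (ALT of “.”) with a small Pnotref and a variant row with a large Pnotref both receive a high score, which is the scale reversal noted in the PA description.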

 

Note: Any duplicate entries are automatically merged into a single entry.
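
The merging behavior can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the VariantEntry class and append_variant function are hypothetical names, entries are assumed to be duplicates when they share the same Contig ID and Ref Pos, and the User ID numbering shown here is a stand-in for whatever SeqMan Pro generates.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class VariantEntry:
    contig_id: str
    ref_pos: int
    user_id: int
    source_assembly: str
    db_snp: str = "."


# The table is keyed by (Contig ID, Ref Pos), so each position appears once.
Table = Dict[Tuple[str, int], VariantEntry]


def append_variant(table: Table, contig_id: str, ref_pos: int,
                   source_assembly: str, db_snp: str = ".") -> VariantEntry:
    """Add a position, merging with an existing entry at the same position."""
    key = (contig_id, ref_pos)
    existing = table.get(key)
    if existing is not None:
        # Duplicate position: only the source assembly is updated;
        # the other stored information is left unchanged.
        existing.source_assembly = source_assembly
        return existing
    entry = VariantEntry(contig_id, ref_pos, user_id=len(table) + 1,
                         source_assembly=source_assembly, db_snp=db_snp)
    table[key] = entry
    return entry


table: Table = {}
append_variant(table, "chr1", 100, "assembly_A", db_snp="rs0000")
append_variant(table, "chr1", 100, "assembly_B")  # merged, not duplicated
```

After the second call, the table in this sketch still holds a single entry for chr1 position 100, now attributed to assembly_B. The identifier rs0000 is a placeholder, not a real dbSNP record.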

 

To create, edit or append to a VCF Variant Table using a text editor:

 

Tables created by SeqMan Pro are stored as text files in the genome template package with which the assembly was created. All entries can be edited except the Contig ID and Ref Pos columns, which must correspond to the template sequences. Similarly, new tables must include the Contig ID and Ref Pos columns, at a minimum. If no User ID is specified, SeqMan Pro will create one automatically the first time the template package is used.
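
Before using a hand-edited table, it may help to check that every row still has the two required values. The Python sketch below is one possible check, not a DNASTAR tool. It assumes the file follows the standard tab-delimited VCF layout, in which the first column (CHROM) holds the Contig ID and the second column (POS) holds the Ref Pos; the function name find_problems is hypothetical.

```python
from typing import List


def find_problems(vcf_text: str) -> List[str]:
    """Report data rows that lack a contig name or an integer position."""
    problems = []
    for number, line in enumerate(vcf_text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, ## meta lines and the #CHROM header
        fields = line.split("\t")
        if len(fields) < 2 or not fields[0].strip():
            problems.append(f"line {number}: missing contig or position")
        elif not fields[1].strip().isdigit():
            problems.append(f"line {number}: position is not an integer")
    return problems


example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\n"
    "chr1\t100\t.\n"
    "chr1\tabc\t.\n"
    "\t5\t.\n"
)
for problem in find_problems(example):
    print(problem)
```

The check only confirms that the two values are present and well formed. It cannot confirm that they correspond to the template sequences, which still has to be verified against the template package itself.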

 

Additional columns with custom data or notes can also be added. However, if duplicate entries are added from a later assembly, the information in these added columns will be lost.

 

To view the VCF Variant Table:

 

To view the table from within SeqMan Pro without appending variants, use the Variant > Show VCF Variants command.

 

To export the VCF Variant Table to use with other applications:

 

See Exporting Variant Data.