The Validation Control Accuracy Report

The report for the Validation Control Accuracy statistical test can be generated in either of two ways:

 

      From within SeqMan NGen, by pressing the Validate SNPs button in the “Project Report” screen. This launches ArrayStar with the report active.

 

      From within ArrayStar, by using the Statistics > Validation Control Accuracy menu command.

 

A Validation Control Accuracy report is generated from the collection of comparisons made between the two experiments. Results are displayed in simple text format within an ArrayStar view tab named Validation Control Accuracy.

 

The report header contains the information shown below:

 


The names of the Standard and the Validation Control experiments.

Calculation start and finish times.

      The experiments assigned as the Validation Control and Standard.

 

      The version number of SeqMan NGen that was used in assembly.

 

      The version number of ArrayStar that was used in Validation Control calculation.

Summary of all targeted positions:

 

      In Standard VCF - the number of positions in the VCF file that are within the targeted regions. These will typically represent the known variant positions.

 

      Not in Standard VCF - the number of positions within the targeted regions that are not present in the VCF file. These will typically represent the known non-variant positions.

 

      Total - the total number of positions in the targeted regions. This is the sum of the two values above.

Summary of all targeted zero coverage positions in experiment:

 

      In Standard VCF - the number of positions in the VCF file that are within the targeted regions with no overlapping data reads in the control experiment.

 

      Not in Standard VCF - the number of positions within the targeted regions that are not present in the VCF file, with no overlapping data reads in the control experiment.

 

      Total - the total number of positions in the targeted regions with no overlapping data reads in the control experiment. This is the sum of the two values above. None of these positions are considered in the accuracy calculations.
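The two summaries above amount to set arithmetic over the targeted positions. The following is a minimal sketch of that bookkeeping, not ArrayStar's implementation; the function name `summarize_positions`, its parameters, and the toy position sets are all hypothetical.

```python
def summarize_positions(targeted, standard_vcf, covered):
    """Count targeted positions by VCF membership, overall and at zero coverage.

    targeted     -- set of positions inside the targeted regions
    standard_vcf -- set of positions listed in the Standard VCF
    covered      -- set of positions with at least one overlapping read
                    in the Validation Control experiment
    """
    in_vcf = targeted & standard_vcf        # typically the known variants
    not_in_vcf = targeted - standard_vcf    # typically the known non-variants
    zero_cov = targeted - covered           # excluded from accuracy calculations
    return {
        "all": {
            "in_vcf": len(in_vcf),
            "not_in_vcf": len(not_in_vcf),
            "total": len(targeted),
        },
        "zero_coverage": {
            "in_vcf": len(in_vcf & zero_cov),
            "not_in_vcf": len(not_in_vcf & zero_cov),
            "total": len(zero_cov),
        },
    }


# Toy data (hypothetical): ten targeted positions, three of them known variants.
targeted = set(range(100, 110))
standard_vcf = {101, 105, 109, 250}   # 250 lies outside the targeted regions
covered = targeted - {105, 107}       # two targeted positions have no reads

summary = summarize_positions(targeted, standard_vcf, covered)
print(summary["all"])
print(summary["zero_coverage"])
```

In each group, the Total is the sum of the In Standard VCF and Not in Standard VCF counts, as the report describes.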


A brief summary of results for the SNP stringency filter, which was set to “high” in this example: minimum depth of coverage = 10, PNotRef ≥ 0.90.

 

Note: If stringency was not specified in the project, ArrayStar uses a minimum depth of coverage = 10 and PNotRef ≥ 0.75.

 

In this example, only the 661 targeted positions in the Standard VCF (standard SNPs) that have a depth of coverage of at least 10 are included. Of those, the 657 positions with PNotRef ≥ 0.90 are considered “true positives.” The 4 positions with a depth of at least 10 but PNotRef < 0.90 are considered to be called as reference bases (“false negatives”).

 

Notice that the number of standard SNPs (661) plus the number of positions with a depth of coverage below 10 (32) equals the total number of targeted positions In Standard VCF (693, two rows up).
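The arithmetic in this example can be checked with a short sketch. The counts (693, 32, 661, 657, 4) and the "high" thresholds come from the text above; the function `classify_standard_snp` and its return labels are hypothetical and only illustrate the rule as described here, not ArrayStar's code.

```python
MIN_DEPTH = 10       # "high" stringency in the example
MIN_PNOTREF = 0.90


def classify_standard_snp(depth, p_not_ref,
                          min_depth=MIN_DEPTH, min_p=MIN_PNOTREF):
    """Classify one targeted position that IS present in the Standard VCF."""
    if depth < min_depth:
        return "excluded"   # below minimum depth: left out of the calculation
    return "TP" if p_not_ref >= min_p else "FN"


# Counts quoted in the example.
in_standard_vcf = 693
below_min_depth = 32
true_positives = 657
false_negatives = 4

standard_snps = in_standard_vcf - below_min_depth
assert standard_snps == 661
assert true_positives + false_negatives == standard_snps

print(classify_standard_snp(25, 0.95))   # TP
print(classify_standard_snp(25, 0.80))   # FN
print(classify_standard_snp(4, 0.99))    # excluded
```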

A breakdown of true/false positives/negatives. See table below for descriptions.

The calculated statistics. See table below for descriptions.

 

The Summary Grid table contains the following information:

 

Column

Description

Min. Depth

The minimum depth of coverage a position must have to be included in the calculation. Results are shown for six levels of depth: 1, 10, 20, 30, 50 and 100.

PNotRef

The minimum PNotRef value a position must have to be counted as a “positive.” For each level of minimum depth, results are shown for three levels of PNotRef: 0.50, 0.75 and 0.90.

T/P

True Positives (TP) - Positions meeting the depth and PNotRef criteria for called variants in the assembly that also occur in the VCF.

F/P

False Positives (FP) - Positions meeting the depth and PNotRef criteria for called variants in the assembly that do not occur in the VCF.

T/N

True Negatives (TN) - Positions meeting the depth but not the PNotRef criteria for called variants in the assembly that do not occur in the VCF.

F/N

False Negatives (FN) - Positions meeting the depth but not the PNotRef criteria for called variants in the assembly that occur in the VCF.

Total

The total number of positions meeting the minimum depth.

The columns below are calculated based on the values in previous columns:

FPR

False Positive Ratio (FPR) = FP / (FP + TN)

FDR

False Discovery Rate (FDR) = FP / (FP + TP)

Sens.

Sensitivity = TP / (TP + FN)

Spec.

Specificity = TN / (FP + TN)

Bal. Accuracy

Balanced Accuracy = (Sens. + Spec.) / 2
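
The calculated columns follow directly from the four counts. Below is a minimal sketch of the formulas listed above; the function name `accuracy_stats` is hypothetical, and returning NaN when a denominator is zero is an assumption of this sketch, since the source does not say how ArrayStar reports that case. The TP and FN values are taken from the earlier example, while the FP and TN values are invented for illustration.

```python
def accuracy_stats(tp, fp, tn, fn):
    """Compute the derived Summary Grid columns from TP, FP, TN and FN."""

    def ratio(numerator, denominator):
        # Assumption: an undefined ratio is reported as NaN.
        return numerator / denominator if denominator else float("nan")

    sens = ratio(tp, tp + fn)      # Sensitivity = TP / (TP + FN)
    spec = ratio(tn, fp + tn)      # Specificity = TN / (FP + TN)
    return {
        "FPR": ratio(fp, fp + tn),         # FP / (FP + TN)
        "FDR": ratio(fp, fp + tp),         # FP / (FP + TP)
        "Sens": sens,
        "Spec": spec,
        "BalAcc": (sens + spec) / 2,       # (Sens. + Spec.) / 2
    }


# TP and FN from the example above; FP and TN are hypothetical.
stats = accuracy_stats(tp=657, fp=3, tn=997, fn=4)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Note that FPR and Specificity share the denominator FP + TN, so they always sum to 1 when that denominator is nonzero.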