In Part A, you used SeqMan NGen to perform the Sanger validation assembly. In this part, you will use SeqMan Pro to visualize the results and to validate the Illumina data using the Sanger trace data.

  1. If you came to this topic directly from Part A, the file Sanger Validation.assembly will already be open in SeqMan Pro. Otherwise, launch SeqMan Pro and use File > Open to open it.
  1. Expand the Report window. The upper half of the report shows the numbers of assembled and unassembled reads, and subsets of these.

The next phase of analysis in SeqMan Pro is to look at variants. The Variant Report (Variant > Variant Report) was used for this purpose in Tutorial 1, Part B. However, the Variant Report is less helpful for the current Sanger Validation assembly, as the assembly contains over 54,000 variants.

Rather than using the Variant Report, you will instead use the Alignment View to visualize the variant results. The Sanger Validation workflow presumes you know where the Sanger reads of interest are located. In this tutorial, you will investigate variants and a potential deletion near position 2,827,522.

  1. In the Project window (“Sanger Validation.assembly”), double-click on the sole contig name to open it in the Alignment View.
  1. In the top left area of the window, click the triangle next to Consensus to instead display the Reference and Majority sequences.

The Majority row only shows bases that differ from those in the Reference (i.e. variants).

  1. Choose Variant > Show Variants. This causes most of the sequence text to fade to light gray, leaving the variants (shown in standard red text) more visible.
  1. Choose Edit > Go to Position. Enter 2827522 as the position, select the lower radio button, and click OK.

The Alignment View is now centered at the chosen position. On the left, note that Illumina and Sanger sequences appear in separate groups, with Illumina data above Sanger.

The current position marks the beginning of a gap in the contig (boxed in red, below) that is not covered by the Illumina paired-end data. In addition, blue letters to the left and right of the gap mark variant positions where the Illumina consensus does not match the reference. The Sanger data can be used to corroborate these variants and also allows closure of the gap in the Illumina reads.

Next, you will review the Sanger trace data to see if it supports the putative variants ‘C’ and ‘G,’ visible to either side of the boxed area, above. These are positions 2,827,519 and 2,827,561, respectively.

  1. To more easily view the Sanger results, hide the Illumina data by clicking the triangle to the left of Illumina. Then zoom in as far as possible using the Zoom In tool, located in the top left corner of the Alignment View.
  1. Display trace data for the Sanger sequences by clicking the triangles to the left of each file name. In the image below, the locations of the two putative Illumina variants have been boxed in red.



    Trace data peaks are colored as follows: A=green, G=black, T=pink/red, C=blue. Do the peak shapes and colors in the areas of interest above corroborate the variants at each position?

    • The left peak is called as ‘C’ in 5/5 Illumina sequences and 2/2 Sanger sequences, and is validated by the well-shaped blue peaks in the Sanger trace data.

    • The right peak is called as ‘G’ in 12/12 Illumina sequences and in one of two Sanger sequences. The remaining Sanger sequence calls the position as ‘R,’ which is the ambiguity code for A/G. If you examine the peak associated with the ambiguous call, there is a clear black ‘G’ peak at this position; it overlaps the shoulder of a neighboring ‘A’ peak. Therefore, ‘G’ is probably the correct call.
  1. To confirm the ‘C’ variant, select its column by dragging across the consensus at that position. Right-click on the selection and choose Variant > Confirm Variant. Do the same for the ‘G’ variant.
  1. To see the confirmed variants in the Variant Report, use Variant > Variant Report and click Show All. The confirmed variants appear at the top of the table with checkmarks indicating their statuses. Note that the Impact column shows that both variants are “Synonymous.”

  1. To create a VCF file containing the confirmed variants, choose Variant > Append Checked Variants to VCF. This file can be exported, if desired, using File > Save VCF Variant Report.
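
The reasoning used in the trace review above (an ‘R’ call is compatible with ‘G’ because R is the ambiguity code for A/G) can be sketched in a few lines of Python. This is only an illustration and is not part of SeqMan Pro. The function names `is_compatible` and `summarize_support` are hypothetical, and the input list simply restates the two Sanger calls described in this tutorial. The ambiguity table is the standard IUPAC nucleotide code set.

```python
# Standard IUPAC nucleotide codes, mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_compatible(call, base):
    """Return True if a (possibly ambiguous) base call could represent `base`.

    Hypothetical helper for illustration; unknown symbols are treated as
    incompatible.
    """
    return base.upper() in IUPAC.get(call.upper(), "")


def summarize_support(calls, base):
    """Count calls that match `base` exactly, ambiguously, or not at all."""
    base = base.upper()
    exact = sum(1 for c in calls if c.upper() == base)
    ambiguous = sum(
        1 for c in calls if c.upper() != base and is_compatible(c, base)
    )
    conflicting = len(calls) - exact - ambiguous
    return {"exact": exact, "ambiguous": ambiguous, "conflicting": conflicting}


# Sanger calls at the right-hand position, as described in the tutorial:
# one read calls 'G', the other calls the ambiguity code 'R' (A or G).
sanger_right = ["G", "R"]
print(summarize_support(sanger_right, "G"))
# prints {'exact': 1, 'ambiguous': 1, 'conflicting': 0}
```

A tally like this only shows that no Sanger call contradicts ‘G’. Deciding whether the ambiguous call really is a ‘G’ still requires inspecting the trace peaks, as described in the steps above.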

Congratulations on completing Tutorial 7.
