This section of the tutorial is self-contained, and can be completed even if you did not do Part A.

In this section of the tutorial, you will align the experimental consensus sequence exported at the end of Part A with reference sequences from four SARS-CoV-2 strains to see which strain is present in the sample. If your sequencing core sequences your samples and assembles them into an accurate draft genome, this type of analysis would be your usual starting point, and you would not need to perform the steps in Part A. If you started with raw long reads and assembled them in Part A, this portion of the tutorial represents an optional downstream analysis in addition to those discussed in Part A.

  1. If you have not already done so, download the tutorial data and extract it to any convenient location (e.g., your desktop).
  1. Launch MegAlign Pro. From the Welcome screen, choose New blank alignment project.
  1. Use File > Add Sequences to add the four COVID-19 .fasta files from the tutorial data folder. Use the same command to add the Spike variant.fasta file created in Part A. If you did not do Part A of the tutorial, instead add the file Spike-var.fasta from the tutorial data folder.

If a popup asks for the file type, specify that the sequences are DNA.

  1. Use Align > Align Using MAFFT to perform a multiple alignment. The alignment will finish in 1-2 minutes.
  1. To generate a phylogenetic tree, click the Tree tab at the bottom of the view. Select Maximum Likelihood: RAxML. Keep all defaults and press OK. The tree will take 1-2 minutes to generate.
  1. Once the tree is displayed, use the green slider at the top of the Tree view to better differentiate the branches, if necessary. There are four strains represented, and these may appear in any order as you scroll down the tree.

| Reference group name | SARS-CoV-2 strain |
|----------------------|-------------------|
| B.1.1.7              | Alpha             |
| B.1.351              | Beta              |
| B.1.617.2            | Delta             |
| C.37                 | Lambda            |

  1. Differentiate the strains even more by applying a different background color to sequences from each strain. To do this:

    1. In the Tree view, select all members of a strain by clicking the horizontal root anchoring its clade.

    2. Choose View > Style > Tree to open the part of the Style panel that controls the appearance of the tree.

    3. In the Style panel, click the Background box to add a checkmark, then click the white box to the right to choose a new shade from the color chooser.



      In the image below, the Beta strain was assigned a green background and the Delta strain was assigned a salmon background.
  1. In the Tree view, locate SRR13380669_NC_045512.2, the experimental draft genome exported as Spike variant.fasta at the end of Part A or opened in Part B as Spike-var.fasta from the tutorial data folder. Note that this sample is a close relative of the “Beta” strain, which was first reported in South Africa in December 2020.

  1. To confirm this observation, open the Distance table by clicking the Distance tab at the bottom of the window. Select the experimental sample row, SRR13380669_NC_045512.2 (typically the bottom-most row), then choose Distance > Order Sequences by Distance from Selection.
  1. The lower-left triangle of the table shows distances between sequences, with white being most related and dark red being least related. Note that all distances in the visible portion of this triangle are “0.00.” To see differences between the sequences more clearly, locate the Distance section of the Style panel (on the right) and change the Decimal places from 2 to 4 (or 5).
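
To see why two decimal places hide the differences, it may help to work through the arithmetic. The sketch below is not how MegAlign Pro computes its Distance table (the tutorial does not state the metric). It assumes a simple p-distance, that is, the fraction of aligned, gap-free columns at which two sequences differ. The function name `p_distance` and the toy sequences (20 mismatches over 30,000 columns) are invented for illustration.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return mismatches / compared


# Toy data (not from the tutorial): 20 mismatches in a 30,000-column alignment.
length = 30_000
ref = "A" * length
sample = "G" * 20 + "A" * (length - 20)

d = p_distance(ref, sample)
two_places = f"{d:.2f}"   # what a 2-decimal display would show
four_places = f"{d:.4f}"  # what a 4-decimal display would show
print(two_places, four_places)
```

With genomes of this size, a few dozen substitutions give a distance well below 0.005, so it rounds to 0.00 at two decimal places and only becomes visible at four or five.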

Scrolling down, notice that most of the closely related sequences—shown in white or very light pink—are from the “Beta” strain B.1.351. Two random examples within these light areas are shown in the image below.
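
The idea behind ordering sequences by distance from a selection can be sketched in a few lines. This is an illustration only, not MegAlign Pro's implementation: the helper names `mismatch_fraction` and `order_by_distance`, the `toy_*` reference names, and the ten-base sequences are all invented, and the distance is assumed to be a plain mismatch fraction over gap-free columns. Only the sample name is taken from the tutorial.

```python
from typing import Dict, List, Tuple


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of gap-free aligned columns at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def order_by_distance(
    alignment: Dict[str, str], selected: str
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs for all other sequences, closest first."""
    query = alignment[selected]
    scored = [
        (name, mismatch_fraction(query, seq))
        for name, seq in alignment.items()
        if name != selected
    ]
    return sorted(scored, key=lambda item: item[1])


# Toy alignment: sequences are invented, not real SARS-CoV-2 data.
toy_alignment = {
    "toy_alpha_ref": "ACGTTCGAAC",
    "toy_beta_ref": "ACGTACGTAC",
    "toy_delta_ref": "TCGTTCGAAG",
    "SRR13380669_NC_045512.2": "ACGTACGTAA",
}

ranking = order_by_distance(toy_alignment, "SRR13380669_NC_045512.2")
for name, dist in ranking:
    print(f"{name}\t{dist:.4f}")
```

In this toy data the invented Beta-like reference sorts first because it differs from the sample at the fewest columns, which mirrors how the reordered table places the closest relatives next to the selected row.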
