In this tutorial, you will use MegAlign Pro to compare an experimental SARS-CoV-2 sample to reference sequences from four SARS-CoV-2 strains to determine which strain is present in the sample. This type of analysis can be used with any viral genome sample that has been sequenced and assembled into an accurate draft genome by your sequencing core.

  1. Download the tutorial data and extract it to any convenient location (e.g., your desktop).
  1. Launch MegAlign Pro. From the Welcome screen, choose New blank alignment project.
  1. Use File > Add Sequences to add the four COVID-19 .fasta files from the tutorial data folder. Use the same command to add the Spike variant.fasta file created in Part A. If you did not do Part A of the tutorial, instead add the file Spike-var.fasta from the tutorial data folder.

If a popup asks for the file type, specify that the sequences are DNA.

  1. Use Align > Align Using MAFFT to perform a multiple alignment. The alignment will finish in 1-2 minutes.
  1. To generate a phylogenetic tree, click the Tree tab at the bottom of the view. Select Maximum Likelihood: RAxML. Keep all defaults and press OK. The tree will take 1-2 minutes to generate.
  1. Once the tree is displayed, move the green slider at the top of the Tree view all the way to the right to better differentiate the branches. There are four strains represented, and these may appear in any order as you scroll down the tree.

     | Reference group name | SARS-CoV-2 strain |
     | --- | --- |
     | B.1.1.7 | Alpha |
     | B.1.351 | Beta |
     | B.1.617.2 | Delta |
     | C.37 | Lambda |

  1. Differentiate the strains even more by applying a different background color to sequences from each strain. To do this:

    1. In the Tree view, select all members of a strain by clicking the horizontal root anchoring its clade.

    2. Choose View > Style > Tree to open the part of the Style panel that controls the appearance of the tree.

    3. In the Style panel, click the Background box to add a checkmark, then click the white box to the right to choose a new shade from the color chooser.

      In the image below, the Beta strain was assigned a green background and the Delta strain was assigned a salmon background.
  1. In the Tree view, locate SRR13380669_NC_045512.2, the experimental draft genome exported as Spike variant.fasta at the end of Part A or opened in Part B as Spike-var.fasta from the tutorial data folder. Note that this sample is a close relative of the “Beta” strain, which was first reported in South Africa in late 2020.

  1. To confirm this observation, open the Distance table by clicking the Distance tab at the bottom of the window. Select the experimental sample row, SRR13380669_NC_045512.2 (typically the bottom-most row), then choose Distance > Order Sequences by Distance from Selection.
  1. The lower-left triangle of the table shows distances between sequences, with white being most related and dark red being least related. Note that all distances in the visible portion of this triangle are “0.00.” To see differences between the sequences more clearly, locate the Distance section of the Style panel (on the right) and change the Decimal places from 2 to 4 (or 5).

Scrolling down, notice that most of the closely related sequences—shown in white or very light pink—are from the “Beta” strain B.1.351. Two random examples within these light areas are shown in the image below.
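To see why two decimal places hide the differences between closely related genomes, the following Python sketch may help. It is an illustration only, not MegAlign Pro's own calculation: it assumes a simple p-distance (the fraction of differing positions among gap-free aligned columns), and the function names, sequence names, and toy sequences are all hypothetical. With a genome of roughly 30,000 bases, a handful of substitutions gives a distance near 0.0001, which displays as 0.00 at two decimal places.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions among columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0


def order_by_distance(aligned: dict, query: str) -> list:
    """Return (name, distance) pairs for all other sequences, closest to `query` first."""
    q = aligned[query]
    rows = [(name, p_distance(q, seq)) for name, seq in aligned.items() if name != query]
    return sorted(rows, key=lambda row: row[1])


def mutate(seq: str, positions) -> str:
    """Toy helper: substitute a different base at each listed position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


# Hypothetical toy alignment: a 30,000-base "sample" and two made-up relatives.
sample = "ACGT" * 7500
aligned = {
    "sample": sample,
    "beta_like": mutate(sample, [10, 2000, 15000]),       # 3 substitutions
    "delta_like": mutate(sample, range(100, 1300, 100)),  # 12 substitutions
}

ranked = order_by_distance(aligned, "sample")
for name, dist in ranked:
    # Same value shown at 2 and at 4 decimal places.
    print(f"{name}: {dist:.2f} vs {dist:.4f}")
```

In this toy case both relatives display as 0.00 at two decimal places, while four decimal places separate them and make the closest relative identifiable.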

