The following tutorial is divided into four parts for clarity, but the entire tutorial requires only about 15 minutes of “hands-on” time. Between Parts A & B, there is also a 12-24 hour wait while the prediction runs.


Part A: Use NovaFold to predict a structure based on a linear sequence:

  1. Using the browser of your choice, go to https://www.rcsb.org/structure/1JTG. From the Download Files menu, choose FASTA sequence. Save to any convenient location, such as your desktop. By default, the filename is rcsb_pdb_1JTG.fasta.
  1. Launch Protean 3D.
  1. In the Welcome screen, click Structure Prediction on the left. Then look at the topmost option in the middle section of this screen.

    • If you are already logged into NovaCloud Services, you will see New protein structure with NovaFold. Click this link to open the NovaFold wizard at the Sequences screen.

    • If you are not logged in to NovaCloud Services, all the links on the right simply say Log in to NovaCloud Services with your DNASTAR Account. Click on any of these links to open the Services view. Enter your login credentials and press Log in. Then click the topmost box, NovaFold, to open the NovaFold wizard at the Sequences screen.
  1. Click Add File and open the FASTA file you downloaded. Note that two jobs are listed in the table, one for each chain. Select the job named 1JTG_2|Chains and click Remove to leave just one chain.

Click Next.
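
For readers who want to see what removing one job amounts to at the file level, here is a minimal Python sketch that reads a multi-record FASTA file and keeps a single entity. It is an illustration only and not part of Protean 3D. The function names `read_fasta` and `keep_entity` are hypothetical, and the sample sequences are made-up placeholders rather than the real 1JTG chains.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def keep_entity(records, entity_id):
    """Keep only records whose header begins with the given entity ID."""
    return [(h, s) for h, s in records if h.split("|")[0] == entity_id]


# Placeholder records: the header layout imitates a two-entity download,
# but the sequences are invented stand-ins, not the real 1JTG chains.
sample = """>1JTG_1|Chains A|EXAMPLE ENTITY ONE
ACDEFGHIKL
MNPQRSTVWY
>1JTG_2|Chains B|EXAMPLE ENTITY TWO
GGGSSS
"""

records = read_fasta(sample)
selected = keep_entity(records, "1JTG_1")
print(len(records), "records read;", len(selected), "kept")
```

To try this on the real download, pass the contents of rcsb_pdb_1JTG.fasta to `read_fasta` in place of `sample`.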

The Options screen provides a variety of advanced options for customizing the prediction algorithm. In the next step, you will change a setting that excludes template structures similar to your query sequence from consideration. This assures you that the query sequence for this tutorial was not chosen simply because it closely matches a structure in our template structure library.

  1. Near the bottom of the Options screen, check the box next to Remove sequences at 90% identity. Then choose 50 from the drop-down menu. This prevents NovaFold from using any template that is more than 50% identical to the query sequence.

  1. Click Next.
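
The effect of an identity cutoff can be pictured with a short Python sketch. This is not NovaFold's implementation: the exact way NovaFold measures identity is not described in this tutorial, so the definition used here (matches divided by aligned, non-gap columns) is an assumption, as are the function names and the invented sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned (non-gap) columns in which both residues match.

    Assumption: gaps ('-') are ignored in the denominator. Other
    definitions of percent identity exist and give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def filter_templates(query, templates, cutoff=50.0):
    """Drop every template whose identity to the query exceeds the cutoff."""
    return {
        name: seq
        for name, seq in templates.items()
        if percent_identity(query, seq) <= cutoff
    }


# Invented, pre-aligned sequences used only to exercise the functions.
query = "ACDEFGHIKL"
templates = {
    "close": "ACDEFGHIKV",    # 9 of 10 columns match
    "distant": "ACDEFWWWWW",  # 5 of 10 columns match
    "gapped": "AC--YYYYYY",   # 2 of 8 non-gap columns match
}

kept = filter_templates(query, templates, cutoff=50.0)
print(sorted(kept))
```

With a cutoff of 50, only the template named "close" is excluded, which mirrors the intent of the setting chosen in the Options screen.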

The Submit screen shows how many predictions you will be submitting to NovaCloud services; in this case, 1 prediction.

  1. Click Submit. When the Close button is enabled, press that button.
  1. Use the Predictions view to monitor the progress of the job. A blue triangle in the Status column shows that the prediction is in progress.

This prediction generally takes 12-24 hours to complete. If you close Protean 3D and come back to check the status later, you can open the Predictions view using View > NovaCloud Services > Predictions.

Once the job status is complete, the Status column will display a blue link.


Part B: Viewing the results and visualizing the alignment of the prediction and templates:

  1. Click on the link to open the results in a special version of the Report view.
  1. Look at the Templates section, which shows the sequence similarity of the query sequence to the top ten template sequences. Templates are ranked based on a combination of three values: Z-Score, %Coverage and %ID. A normalized Z-Score greater than 1 indicates a good threading alignment, and higher values are better for both %Coverage and %ID.

  1. Click the link below the table, Send template alignment to MegAlign Pro. The sequences open with the alignment already completed. The query sequence is at the top, with the rank 1-10 templates just below, in that order.
  1. To highlight the differences between the templates and the query sequence, use the Multiple Alignment section of the Style panel on the right (View > Style > Multiple Alignment).

    1. In the Reference menu, choose 1JTG_1Chains.

    2. Under Alignment coloring, choose Show only differences from reference.

The Sequences view now displays only the residues in each template that differ from the query sequence.

The reason for the large number of disagreements is that you told NovaFold to “Remove sequences at 50% identity” in Part A, Step 5 of this tutorial. In real life, you would normally allow NovaFold to use templates with a higher percent identity to achieve the best accuracy.
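
The idea behind a "differences from reference" display can be sketched in a few lines of Python. This is a simplified illustration, not MegAlign Pro's rendering code: the function name is hypothetical, the sequences are invented, and the use of a dot for matching positions is an arbitrary choice made for this sketch.

```python
def show_differences(reference, aligned, match_char="."):
    """Replace positions that match the reference with match_char.

    Both sequences must already be aligned to the same length.
    """
    if len(reference) != len(aligned):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        match_char if ref == res else res
        for ref, res in zip(reference, aligned)
    )


# Invented aligned sequences; '-' marks a gap in the template.
reference = "MSIQHFRVAL"
template = "MSVQHF-VSL"

print(reference)
print(show_differences(reference, template))
```

A template with low percent identity to the query leaves many residues visible in this kind of view, which is the pattern described above.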

  1. Close MegAlign Pro without saving the results and return to Protean 3D.


Part C: Viewing and corroborating NovaFold’s “protein function” prediction:

  1. In Protean 3D, look at the Model Overview section. This table compares the structural similarity of each NovaFold prediction to experimental structures in the RCSB Protein Data Bank (PDB). Model 1 has a TM-Score of 0.89, shown in green. Green indicates that the confidence in this structure prediction is appropriate for drug design, drug screening, ligand docking, and molecular replacement.
  1. Now look at the Model 1 section just below the table. To get a better look at the model, click Spin the model.
  1. Scroll down to the end of the Model 1 section. Even though NovaFold was given only a protein sequence as its starting point, its machine learning-based analysis predicted that the GO Molecular Function of the folded protein involves beta-lactamase activity. As corroboration, the PubMed Abstract for 1JTG (on its PDB page) describes 1JTG as a complex of TEM-1 beta-lactamase with beta-lactamase inhibitor protein; the chain you submitted is the beta-lactamase.
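
For background on the TM-Score shown in the Model Overview table, the sketch below evaluates the commonly used TM-score definition from a list of distances between paired residues in two superposed structures. How NovaFold derives the value it reports is not described in this tutorial, so treat this as a general illustration rather than NovaFold's method. The function name and the input distances are hypothetical.

```python
def tm_score(distances, target_length):
    """TM-score from per-residue distances (in angstroms).

    distances     -- distances between paired residues after superposition
    target_length -- number of residues in the target (reference) chain

    Assumption: the structures are already superposed and paired. Real
    tools also search for the superposition that maximizes the score.
    """
    if target_length < 22:
        # Very short chains need special handling of the distance scale,
        # which is outside the scope of this sketch.
        raise ValueError("this sketch expects a chain of at least 22 residues")
    # Length-dependent distance scale used in the standard definition.
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Made-up distances for a 100-residue target: a perfect match scores 1.
perfect = tm_score([0.0] * 100, 100)
# Only half of the residues are paired (and match perfectly).
half_covered = tm_score([0.0] * 50, 100)
print(perfect, half_covered)
```

Because the sum is divided by the target length, residues that are missing from the alignment lower the score just as poorly placed residues do.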


Part D: Comparing the predicted structure to the version based on x-ray crystallography:

  1. Scroll back up to the section with the spinning Model 1. Click the link Open model in new document. This opens the model as a structure in Protean 3D. You can use the views, apply tracks, etc. just like with any structure.
  1. Choose Structure > Align Structures > Structure Alignment. Note that the 1JTG_1Chains-1 structure is already included in the File list. Press the Add PDB ID button. Type the name 1JTG. This directs Protean 3D to download the structure file that was derived from x-ray crystallography from the PDB website.

  1. Press OK, then press Next.

The right side of the screen shows that the two items are ready for alignment. Press Run or Run Now.
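
Outside Protean 3D, the same two ingredients (an experimental entry identified by its PDB ID, and the coordinates of a single chain) can be sketched in Python. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the demo records use made-up residues and coordinates, and no download is performed so that the sketch runs offline. The slices follow the fixed-column layout of PDB-format ATOM records.

```python
def pdb_download_url(pdb_id):
    """Build the RCSB file-download URL for a PDB ID (no request is made)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def ca_coordinates(pdb_text, chain_id):
    """Collect alpha-carbon (CA) coordinates for one chain from PDB text."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        # Columns 13-16 hold the atom name and column 22 the chain ID.
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        # Columns 31-54 hold the x, y and z coordinates.
        coords.append(
            (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        )
    return coords


def demo_atom(serial, name, res, chain, resseq, x, y, z):
    """Format a minimal ATOM record for the demo (made-up values)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


# Invented records: three atoms in chain A and one atom in chain B.
sample = "\n".join([
    demo_atom(1, " N", "HIS", "A", 26, 1.0, 2.0, 3.0),
    demo_atom(2, " CA", "HIS", "A", 26, 2.5, 2.0, 3.0),
    demo_atom(3, " CA", "PRO", "A", 27, 4.0, 5.0, 6.0),
    demo_atom(4, " CA", "ALA", "B", 1, 9.0, 9.0, 9.0),
])

print(pdb_download_url("1jtg"))
print(ca_coordinates(sample, "A"))
```

Restricting the comparison to one chain matters here because the prediction covers a single chain, while the PDB entry contains the whole complex.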

  1. On the right of the window, open the Molecules (View > Explorer > Molecules) and the Color (View > Style > Color) areas so you can see both at the same time. Note that the predicted chain A is overlaid on the entire structure downloaded from PDB.
  1. To display only the Chain A data, go to the Molecules area and uncheck all the boxes except for 1JTG_1Chains-1 and 1JTG > A.
  1. Now color the predicted version yellow and the PDB version blue, as follows:

    1. In the Molecules section (View > Explorer > Molecules), select the predicted structure by clicking on NovaFold predicted model 1.

    2. In the Color section (View > Style > Color), click the colored box below Fill and choose any shade of yellow from the color picker.

    3. In the Molecules section, select the A chain of the PDB structure by clicking on BETA LACTAMASE TEM to the right of A.

    4. In the Color section, click the colored box below Fill and choose any shade of blue from the color picker.

    5. Double-click anywhere in the black area of the Structure view to clear the selection and allow you to see the colors better.
  1. Spin and examine the overlaid structures, noting that NovaFold’s prediction closely matches the structure discovered using x-ray crystallography. This is even more remarkable knowing that you artificially blocked NovaFold from being able to use templates that were more closely related to the query sequence (in Part A, Step 5).

  1. In the Report view (View > Report > Show), look at Report 2: Structure Alignment Report. Note that the RMSD [Å] for the alignment is 1.544. Low RMSD values like this one indicate very good matches, with a value of zero representing a perfect match.
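
To make the RMSD figure concrete, the following Python sketch computes the root-mean-square deviation between two sets of paired coordinates. It assumes the structures are already superposed and that atoms are paired one to one. Which atoms and residue pairs the Structure Alignment Report uses is not stated in this tutorial, so this shows the general formula only. The function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one coordinate pair is required")
    squared = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates: the second set is shifted by 1 angstrom along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(predicted, predicted))
print(rmsd(predicted, experimental))
```

Identical coordinate sets give an RMSD of zero, and every pair being exactly 1 angstrom apart gives an RMSD of 1.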

This is the end of the tutorial.
