Alanine scanning benchmarks for protein design in DNASTAR Lasergene
With Lasergene 16, we introduced alanine and serine hot spot scanning for protein design. These methods allow you to identify residues that are important for protein folding and are an important first step in many protein design experiments. Hot spot scanning in Lasergene can be completed via a simple point-and-click workflow, and usually takes a few seconds to a few minutes to complete, depending on your parameter settings.
In this article, we explore the accuracy of alanine hot spot scanning in multiple software applications and show you how to perform this workflow with your own data in Protean 3D, part of the Lasergene Protein package.
How accurately does alanine scanning in Lasergene predict changes in protein fold stability?
To answer this question, we compared experimentally determined thermodynamic stability data for alanine substitution mutations at nearly every position in the β1 domain of Streptococcal protein G (G β1) to in silico calculations in Lasergene Protein, as well as calculations from five other software tools. Our results show that these tools vary widely in the accuracy of hot spot detection, especially at low error thresholds.
Compared to PopMuSic, FoldX, and three Rosetta methods, Lasergene produces the most correct predictions at every error tolerance at or below 1.5 kcal/mol.
The Lasergene Protein alanine hot spot scanning method provides the most accurate prediction of energy change in the G β1 protein, with the tightest tolerance of any tool studied.
Based on our error tolerance analysis, DNASTAR’s Lasergene Protein produces the most predictions that fall within the lowest error tolerances of the experimentally determined change in energy. This error analysis considers absolute error: the magnitude of the difference between the predicted and actual change in fold stability.
DNASTAR predictions from Lasergene Protein (Figure 1; top in blue) are the most accurate across all tolerances, even at the lowest error thresholds.
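The tolerance analysis described here can be sketched in a few lines of Python. This is a minimal illustration and not DNASTAR's implementation: the function names are hypothetical, and the energy values are placeholders chosen for the example, not data from the G β1 study.

```python
def absolute_errors(predicted, experimental):
    """Return |predicted - experimental| for each variant (kcal/mol)."""
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have the same length")
    return [abs(p - e) for p, e in zip(predicted, experimental)]


def fraction_within_tolerance(predicted, experimental, tolerance):
    """Fraction of predictions whose absolute error is <= tolerance."""
    errors = absolute_errors(predicted, experimental)
    if not errors:
        raise ValueError("at least one variant is required")
    return sum(err <= tolerance for err in errors) / len(errors)


# Placeholder values for illustration only (NOT from the G β1 data set).
experimental_ddg = [0.2, 1.5, -0.4, 2.8]
predicted_ddg = [0.5, 1.0, 0.3, 1.0]

for tol in (0.5, 1.0, 1.5):
    share = fraction_within_tolerance(predicted_ddg, experimental_ddg, tol)
    print(f"within {tol} kcal/mol: {share:.0%}")
```

Repeating the calculation for each tool over a range of tolerances gives the kind of comparison summarized in Figure 1.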
In Figure 2, alanine variants at each of 44 positions within G β1 are sorted by the experimental energy change value, with the most stabilizing mutations at the top. The magnitude of absolute error for each of the scanning tools is indicated by color, with green representing the lowest error and red the highest. The color representing absolute error for DNASTAR hot spot predictions is also mapped onto the G β1 structure shown in Figure 3.
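A chart of this kind could be assembled roughly as follows. This sketch rests on several assumptions that the article does not specify: that a more negative energy change means a more stabilizing mutation, that color is a linear green-to-red blend, and that errors at or above a chosen `max_error` are drawn fully red. The variant labels and values are placeholders, not study data.

```python
def error_to_rgb(error, max_error):
    """Linearly blend from green (zero error) to red (max_error or more)."""
    if max_error <= 0:
        raise ValueError("max_error must be positive")
    t = min(max(error / max_error, 0.0), 1.0)  # clamp to [0, 1]
    return (round(255 * t), round(255 * (1 - t)), 0)


# Hypothetical variants: (label, experimental change, predicted change).
variants = [
    ("variant_1", 2.0, 1.5),
    ("variant_2", -0.5, 0.5),
    ("variant_3", 0.8, 0.8),
]

# Assumed sign convention: most negative value = most stabilizing = top row.
rows = sorted(variants, key=lambda v: v[1])

for label, experimental, predicted in rows:
    color = error_to_rgb(abs(predicted - experimental), max_error=2.0)
    print(label, experimental, color)
```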
For a set of 44 alanine variants in the G β1 data set, Lasergene Protein predictions have a Pearson linear correlation coefficient of 0.72 for predicted versus actual changes in fold stability, well ahead of FoldX and three Rosetta methods (at 0.47, 0.30, 0.49, and 0.61, respectively) and comparable to PopMuSic at 0.75.
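For readers who want to compute the same statistic on their own predictions, the Pearson linear correlation coefficient can be calculated as in the sketch below. The function name `pearson_r` is hypothetical, and the input lists are placeholders rather than the study's values.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values for illustration only (NOT from the G β1 data set).
experimental_ddg = [0.2, 1.5, -0.4, 2.8]
predicted_ddg = [0.5, 1.0, 0.3, 1.0]
print(f"r = {pearson_r(predicted_ddg, experimental_ddg):.2f}")
```

A value near 1 means predicted and experimental changes rise and fall together. Because correlation ignores systematic offsets, it is best read alongside the absolute error analysis rather than in place of it.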
Lasergene Protein also has the lowest error for alanine substitutions with the largest energy changes (the variants shown at the top and bottom of Figure 2), making it a reliable predictor of true hot spots.
Alanine Scanning Workflow
The following steps for hot spot scanning and protein design can be completed on virtually any Mac or Windows computer in just a few minutes:
1) Open a PDB structure file in Protean 3D.
2) Start a hot spot scan, choose either alanine or serine scanning, and specify whether to allow backbone flexibility and/or to repack neighbors.
3) (optional) Model additional variants at positions of interest to test other hypotheses in silico using our “Create Variant” workflow. Figure 4 shows sample results.
4) (optional) Use the hypotheses generated from the Protein Design workflow to guide primer design for PCR site-directed mutagenesis.
Try this workflow with your own data
Request your free trial of Lasergene NOW to try Protean 3D’s protein design workflow for yourself!