This first section of Part B will show you how to evaluate one of the contigs assembled by SeqMan NGen.
- If you came to this topic directly from Part A, the file De novo assembly.sqd will already be open in SeqMan Pro. Use File > Save and save the project under the default name. Otherwise, launch SeqMan Pro and use File > Open to open it from the location in which it was saved.
- Because this was a de novo assembly, the Project Summary window contains multiple contigs.
- Click once on Contig 5 to select it, and then select Contig > Strategy View.
- Arrange and resize the Strategy View until it occupies the upper half and the full width of the SeqMan Pro window.
- Repeatedly click the Zoom In tool until you see only the first 1,000 or so bases.
- Look at the Coverage Threshold graph, which is located just below the ruler. The color and thickness of the band represent the coverage within the assembly as it compares to the threshold parameters defined under Project > Parameters > Strategy Viewing & Coverage.
Hover over the graph to reveal a legend.
The thick green bar shows that the majority of this section has excellent coverage on both strands. However, bases 1-300 were sequenced in only one direction, as indicated by the thin cyan (blue) bar.
The Depth of Coverage graph shows similar information, but displays coverage by the number of reads at each location.
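To make the relationship between the two graphs concrete, the following Python sketch computes per-base depth and per-strand coverage from a small set of reads. It is only an illustration, not SeqMan Pro's own calculation: the `Read` record, the `strand_depth` and `classify` functions, the `min_per_strand` threshold, and the example reads are all hypothetical.

```python
from collections import namedtuple

# Hypothetical read record: 1-based inclusive start/end, strand '+' or '-'.
Read = namedtuple("Read", ["start", "end", "strand"])

def strand_depth(reads, length):
    """Return (top, bottom) per-base depth lists for a contig of `length` bases."""
    top = [0] * length
    bottom = [0] * length
    for r in reads:
        target = top if r.strand == "+" else bottom
        # Convert the 1-based inclusive interval to 0-based list indices.
        for pos in range(r.start - 1, min(r.end, length)):
            target[pos] += 1
    return top, bottom

def classify(top, bottom, min_per_strand=1):
    """Label each base by how many strands meet the (assumed) threshold."""
    names = ("none", "one strand", "both strands")
    labels = []
    for t, b in zip(top, bottom):
        strands = int(t >= min_per_strand) + int(b >= min_per_strand)
        labels.append(names[strands])
    return labels

# Made-up example: two top-strand reads and one bottom-strand read on a 10-base contig.
reads = [Read(1, 6, "+"), Read(4, 10, "+"), Read(5, 10, "-")]
top, bottom = strand_depth(reads, 10)
depth = [t + b for t, b in zip(top, bottom)]   # analogous to a depth-of-coverage plot
labels = classify(top, bottom)                 # analogous to a coverage-threshold band
```

In this toy example, the first four bases are covered on one strand only, which mirrors the single-direction region at the start of Contig 5.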
- Double-click near the beginning of the coverage graph. This opens the Alignment View at the same location. Scroll all the way to the left of the Alignment View. Then, to enable simultaneous viewing of both views, resize and relocate the Alignment View so it takes up the lower half of the SeqMan Pro window.
According to the Strategy View graphs, the area at the beginning of Contig 5 should have single-direction coverage. This is confirmed by the Alignment View, where all of the arrows are right-facing.
- In the Strategy View, double-click near “800,” which is shown as being in a high-coverage area. Note that the Alignment View automatically updates so that it is centered at the same base. In the Alignment View, note that the coverage here is much deeper: deep enough to require a vertical scroll-bar. In addition, the red and green arrows pointing in opposite directions signify that the coverage is in both directions.
- To display agreement and conflict between fragment sequences, check the Conflicts box at the top left of the Strategy View.
The conflict score, plotted on the histogram as a black bar, is thin throughout most of its length. This indicates little or no conflict in most areas.
- To check pair consistency, look at the Pair Consistency graph in the Strategy View. (The image below was expanded by dragging down the pane divider between the upper and lower sections of the Strategy View.) The thick green portion above the baseline signifies that the majority of pairs are consistent with the current assembly. The small orange bar below the baseline denotes a region where pairs are inconsistent with respect to assembly location or orientation. Note the different scales for consistent (positive axis) and inconsistent (negative axis) pairs.
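As a rough illustration of what "consistent" can mean for a read pair, the sketch below labels a pair by orientation and by distance. The criteria SeqMan Pro actually applies may differ. The `classify_pair` function, the inward-facing orientation rule, and the `min_insert` and `max_insert` values are assumptions made for this example.

```python
def classify_pair(mate1, mate2, min_insert=200, max_insert=600):
    """Classify a read pair placed in the same contig.

    Each mate is a (start, end, strand) tuple with 1-based inclusive
    coordinates. The insert-size limits are hypothetical defaults.
    """
    left, right = sorted([mate1, mate2], key=lambda m: m[0])
    # Assumed library layout: mates face each other (left '+', right '-').
    if left[2] != "+" or right[2] != "-":
        return "inconsistent orientation"
    # Distance from the outer end of one mate to the outer end of the other.
    span = right[1] - left[0] + 1
    if not (min_insert <= span <= max_insert):
        return "inconsistent distance"
    return "consistent"

# Made-up examples
ok = classify_pair((100, 199, "+"), (400, 499, "-"))
bad_orientation = classify_pair((100, 199, "+"), (400, 499, "+"))
bad_distance = classify_pair((100, 199, "+"), (2000, 2099, "-"))
```

Counting the pairs in each category at every position would give the kind of above-baseline and below-baseline totals that a pair consistency plot summarizes.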
The lower half of the Strategy View displays arrows representing the individual sequences in the contig. For detailed descriptions of these arrows, see Paired End Arrow Colors in the SeqMan Pro online help.
- To limit the display to certain types of arrows (in this example, “consistent pairs”), click and hold down the mouse button on the Show All Reads tool, and then select the “key” tool.
In the ensuing dialog, first select Custom to clear all the selections, then check the Consistent single contig box to the left of the paired green arrows.
Close the dialog by clicking the ‘x’ or close button in the top corner.
- Scroll down the Strategy View, observing the excellent depth of coverage by consistent pairs.
- To view pair consistency information in text format, open the Statistics Report using the Project > Statistics command. It will take approximately 1-2 minutes for statistics to be calculated.
The Statistics Report provides information for all contigs, including the number of top and bottom-strand constituent sequences, the average sequence depth, and the contig length.
For Contig 5, there are approximately equal numbers of sequences on the top and bottom strands (7834 vs. 7693), and a total of 2 gaps in a 29,653 bp sequence.
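The figures quoted above for Contig 5 can be checked with a few lines of arithmetic. This is only a convenience sketch; the variable names are illustrative and the values are copied from the report as described in this tutorial.

```python
# Values reported for Contig 5 in this tutorial.
top_strand_reads = 7834
bottom_strand_reads = 7693
contig_length_bp = 29653
gaps = 2

total_reads = top_strand_reads + bottom_strand_reads
top_fraction = top_strand_reads / total_reads        # share of reads on the top strand
gaps_per_kb = gaps / (contig_length_bp / 1000)       # gap density per 1,000 bases

print(f"Total reads: {total_reads}")
print(f"Top-strand share: {top_fraction:.1%}")
print(f"Gaps per kb: {gaps_per_kb:.3f}")
```

A top-strand share close to 50% is what "approximately equal numbers" means in practice.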
Note that the Scaffold number for Contig 5 is “0,” because scaffolds have not yet been set up. This will be addressed in section 4. Ordering contigs into scaffolds.
- Close all windows except for the Project Summary window (“De novo assembly.sqd”).