Splitting Contigs

Note: This topic is not applicable to BAM-based projects.

When sequence reads are aligned into contigs, the alignments rarely show 100% identity between all reads at every position. Sequence differences between reads are called conflicts. Conflicts may arise from sequencing errors, from underlying genetic variation, or from misassembly, in which similar reads from distinct chromosomal locations are placed at the same location in the assembly.

Sequencing errors tend to be sporadic, whereas conflicts arising from genetic variation or misassembled reads typically show recognizable patterns. SeqMan Pro lets you define and search for such patterns, and it can split contigs at the regions where these conflicts occur.
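
The distinction between sporadic and patterned conflicts can be illustrated with a minimal Python sketch. This is not SeqMan Pro's own method; the function name `conflict_columns`, the `min_support` threshold, and the toy reads are all hypothetical. The sketch simply assumes that a conflict is "patterned" when the second most common base in a column is carried by at least `min_support` reads, and "sporadic" otherwise.

```python
from collections import Counter

def conflict_columns(reads, min_support=2):
    """Label each conflicting alignment column as 'sporadic' or 'patterned'.

    reads: equal-length aligned strings; a space marks a position that a
    read does not cover. (Hypothetical input format for illustration.)
    """
    result = {}
    for col in range(len(reads[0])):
        # Count the bases observed in this column, skipping uncovered positions.
        counts = Counter(r[col] for r in reads if r[col] != " ")
        if len(counts) < 2:
            continue  # every covering read agrees, so there is no conflict
        # Number of reads carrying the second most common base.
        minor_support = counts.most_common(2)[1][1]
        result[col] = "patterned" if minor_support >= min_support else "sporadic"
    return result

reads = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTAC",
    "ACGTACCTAC",
]
print(conflict_columns(reads))  # {3: 'patterned', 6: 'sporadic'}
```

In this toy alignment, column 3 differs in two reads in the same way, which is the kind of pattern that may indicate variation or a false join, while column 6 differs in a single read, as an isolated sequencing error typically would.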

If you are using trace data, SeqMan Pro’s Trace Quality Evaluation system can usually generate a better consensus than manual editing or other automated methods. However, you can still review and edit conflicts in constituent sequences, as well as gaps in both constituent and consensus sequences.

The ability to split contigs in SeqMan Pro was developed for two applications: (i) locating genetic heterogeneity among reads in the same contig and resolving those reads into genetically homogeneous groups, and (ii) locating and resolving false joins.
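
As a rough illustration of the first application, the sketch below groups reads by the bases they carry at a set of conflict columns, so that each group is homogeneous at those positions. It is an assumption-laden toy, not a description of how SeqMan Pro performs a split: the function `split_reads`, the read names, and the chosen columns are hypothetical.

```python
from collections import defaultdict

def split_reads(reads, columns):
    """Group read names by the bases they carry at the given columns.

    reads: dict mapping a read name to its aligned sequence (hypothetical format).
    columns: 0-based alignment positions where patterned conflicts were found.
    """
    groups = defaultdict(list)
    for name, seq in reads.items():
        # Reads sharing the same bases at every conflict column fall into one group.
        key = "".join(seq[c] for c in columns)
        groups[key].append(name)
    return dict(groups)

reads = {
    "r1": "ACGTACGTAC",
    "r2": "ACGTACGTAC",
    "r3": "ACGAACGAAC",
    "r4": "ACGAACGAAC",
}
groups = split_reads(reads, columns=[3, 7])
print(groups)  # {'TT': ['r1', 'r2'], 'AA': ['r3', 'r4']}
```

Each resulting group would correspond to one candidate contig after a split.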

Before you elect to split any contig, be aware that this operation cannot be undone. We recommend saving your assembly project before splitting any contigs, and then saving the split version under a different name. That way, you can return to the original project if you later decide that a split should not have been made.