To display the Consensus Match:
To display or hide the Consensus Match track in the Sequences view footer, perform an alignment and then check or uncheck the Consensus Match box in the Tracks panel. This box is only visible after performing an alignment.
How Consensus Match is calculated:
The Consensus Match histogram is calculated by dividing the total score for the called consensus character by the number of sequences at the position. As agreement increases, the bar height increases and appears in a lighter shade of green. The histogram is not calculated for positions where there are consensus gaps. Hover over any bar in the histogram to display an information balloon showing the consensus match percentage, as demonstrated in the image below.
The following procedure is used to calculate the consensus sequence and the match percent for a given histogram column:
- Score the characters in the column - Each occurrence of an unambiguous character in the column is scored as “1” for that character. If the character is an ambiguity code, a fractional count is added to each of the characters it represents. For example, in a nucleotide sequence, the character R would add “1/2” to the counts for A and G. Likewise, B would add “1/3” to the counts for C, G, and T.
- Use the counts to determine the consensus character - If a single unambiguous character is the most frequent, the consensus is called for that character. If there is a two-way tie and one of the tied characters is a gap, the non-gap character is called. For any other tie between two or more characters for the maximum score, the consensus is called as “X” for proteins or “n” for nucleic acids.
- Determine the case of the consensus character - If the count for the called character is greater than 50% of the total count for the column, upper case is used. Otherwise, lower case is used.
- Calculate the consensus match for the histogram - The histogram calculation uses the “maximum count” value, which is either the count of the single predominant character (the unambiguous consensus) or the tied value that led to the calling of an ambiguity code. The final consensus match is this maximum count divided by the number of sequences at the position.
| Characters in the DNA alignment column | Consensus match calculation | Histogram bar size (as % of total available height) |
|---|---|---|
| A C G Y | | |
| A A C C G T | | |
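The procedure above can be sketched in a few lines of Python and applied to the two example columns. This is only an illustrative reading of the documented steps, not the application's actual implementation. The function names `score_column` and `consensus_match` are hypothetical, the sketch handles nucleotide columns only, and it assumes that gaps are written as "-" and that gap characters count toward the number of sequences at the position.

```python
from fractions import Fraction

# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
GAP = "-"  # assumed gap symbol


def score_column(column):
    """Score 1 per plain character and an equal fractional share per ambiguity code."""
    scores = {}
    for ch in column.upper():
        targets = IUPAC.get(ch, ch)  # a plain character maps to itself
        share = Fraction(1, len(targets))
        for base in targets:
            scores[base] = scores.get(base, Fraction(0)) + share
    return scores


def consensus_match(column):
    """Return (consensus character, match fraction) for one nucleotide column.

    The match is None when the consensus is a gap, because the histogram
    is not calculated for those positions.
    """
    scores = score_column(column)
    top = max(scores.values())  # the "maximum count" value
    tied = [c for c, s in scores.items() if s == top]

    # A two-way tie that involves a gap goes to the non-gap character.
    if len(tied) == 2 and GAP in tied:
        tied.remove(GAP)

    if len(tied) > 1:
        called = "n"  # remaining ties are called as "n" for nucleic acids
    elif tied[0] == GAP:
        return GAP, None
    elif top * 2 > sum(scores.values()):
        called = tied[0].upper()  # more than 50% of the column total
    else:
        called = tied[0].lower()

    # Assumption: every character in the column, gaps included, is one sequence.
    return called, top / len(column)


for col in ("ACGY", "AACCGT"):
    called, match = consensus_match(col)
    print(col, called, f"{float(match):.1%}")
# ACGY c 37.5%
# AACCGT n 33.3%
```

Under these assumptions, the column A C G Y scores C at 1.5 (one C plus half of Y), which is the maximum but not more than half of the column total of 4, so the consensus is a lower case "c" with a match of 1.5 / 4. The column A A C C G T has A and C tied at 2, so the consensus is "n" with a match of 2 / 6.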
To change Consensus Match options:
To learn how to access the options section for this track, see Tracks.
- By default, the Consensus Match histogram is shown in tones of green. To choose another color, click on the color box to the right of Color.
Click if you wish to return to the default values.