DESeq2 or edgeR statistics for an assembly can be analyzed by opening the assembly in ArrayStar. For information about setting up an assembly suitable for analyzing DESeq2 or edgeR statistics in ArrayStar, see Create an assembly using DESeq2 or edgeR statistics.
Both methods require a control group to be specified, and both require replicate samples for each experimental condition and for the control. Note that when multiple experimental conditions are being considered, each condition is tested against the same control group. The original P-values from the statistical tests are then adjusted for multiple testing using the Benjamini-Hochberg (1995) procedure, which controls the false discovery rate.
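As a rough illustration of the Benjamini-Hochberg adjustment mentioned above, the following minimal Python sketch computes adjusted P-values from a list of raw P-values. The function name `benjamini_hochberg` and the sample values are hypothetical and chosen only for illustration; this is not ArrayStar's internal implementation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(p_values)
    # Indices of the P-values, sorted from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down to the smallest, scaling each by
    # m / rank and keeping a running minimum so that the adjusted values
    # never decrease as the raw P-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values, one per gene.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

A gene is then typically called differentially expressed when its adjusted P-value falls below a chosen false discovery rate threshold.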
Differences between DESeq2 and edgeR are shown in the table below:
| | DESeq2 | edgeR |
|---|---|---|
| Normalization method | Uses a median of ratios method to normalize read counts to account for sequencing depth and RNA composition. Provides two transformations of the normalized counts: regularized logarithm (rlog) and variance stabilizing transformation (VST). DESeq2 does not attempt to account for transcript length, since it compares counts between samples for the same gene and assumes the length does not change. This assumption holds except in rare cases where the dominant transcript length changes between samples, for example due to alternative splicing. | Uses the "trimmed mean of M-values" (TMM) method (Robinson & Oshlack, 2010; see Research References). The TMM-normalized read counts can be viewed in the ArrayStar tables, where counts are represented as log2(counts per million reads). Normalized counts generated by a different method, RLE, are also available within ArrayStar, but these values are not used for the actual statistical tests. RLE is similar to the median of ratios normalization method used by DESeq2. |
| Statistical tests for differential expression | DESeq2 uses raw counts, rather than normalized count data, and models the normalization within a Generalized Linear Model (GLM) of the negative binomial family with a logarithmic link. Statistical tests are then performed to assess differential expression, if any. | Data are normalized to account for sample size differences and variance among samples. The normalized count data are used to estimate per-gene fold changes and to perform statistical tests of whether each gene is likely to be differentially expressed. edgeR uses an exact test under a negative binomial distribution (Robinson and Smyth, 2008; see Research References). The test is related to Fisher's exact test, which is based on the hypergeometric distribution rather than the negative binomial. |
| Data reporting method | In ArrayStar, the rlog values are used by default in the scatter plot and for clustering. VST values are displayed as Gene Table data columns. | In ArrayStar, the log2(CPM) values calculated using TMM are used by default in the scatter plot. In the Gene Table, values for fold change compared to the control are represented as log(fold change). |
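To make the normalization ideas in the table more concrete, here is a minimal Python sketch of median of ratios size factors and of a plain log2(counts per million) transformation. The function names `size_factors` and `log2_cpm`, the `pseudocount` parameter, and the toy count matrix are all assumptions made for illustration. The sketch omits TMM scaling and the dispersion modeling that DESeq2 and edgeR perform, so its output will not match the values shown in ArrayStar.

```python
import math
from statistics import median


def size_factors(counts):
    """Median of ratios size factors.

    counts: list of genes, each a list of raw read counts per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes with non-zero counts in every sample, so that the
    # geometric mean across samples is well defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count in sample j to its geometric mean.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def log2_cpm(counts, pseudocount=1.0):
    """log2 of counts per million, without TMM scaling (illustrative only).

    The pseudocount is an assumption added here to avoid taking log2(0).
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2(row[j] / lib_sizes[j] * 1e6 + pseudocount) for j in range(n_samples)]
        for row in counts
    ]


# Toy matrix: 3 genes x 2 samples, where sample 2 was sequenced twice as deeply.
toy = [[10, 20], [100, 200], [50, 100]]
print(size_factors(toy))
print(log2_cpm(toy))
```

In the toy matrix, the second sample has exactly twice the counts of the first, so its size factor is twice as large and the per-gene log2(CPM) values agree between the two samples.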