Multiple Testing Corrections

Statistical tests such as the Fisher’s Exact Test Signal Search, Student’s t-Test, F-Test (ANOVA), and Moderated t-Test are used to identify differentially expressed genes. With a large dataset, however, these tests can yield a substantial number of false positives.

For example, a t-Test can be applied to a group of genes, and those with a P-value below a chosen threshold (0.05, for example) can be selected as differentially expressed. However, when the test is performed on a large number of genes (on the order of 10,000), a threshold of 0.05 lets through about 5% of the genes that are not actually differentially expressed. That is roughly 500 genes, all of which will be selected as differentially expressed. These genes are false positives, and this issue is referred to as the Multiple Testing problem.
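
The arithmetic above can be checked with a small simulation. The sketch below is illustrative only and is not part of ArrayStar: it assumes that none of the 10,000 genes is truly differentially expressed, and it relies on the fact that P-values from a continuous test are uniformly distributed between 0 and 1 when there is no real effect. The function name count_false_positives and its parameters are hypothetical.

```python
import random


def count_false_positives(n_genes=10_000, alpha=0.05, seed=0):
    """Count genes passing the P-value threshold when no gene is truly changed.

    Assumption: with no real effect, each gene's P-value is an independent
    draw from a uniform distribution on [0, 1).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_values = [rng.random() for _ in range(n_genes)]
    # Every gene counted here is a false positive, since none is truly changed.
    return sum(p < alpha for p in p_values)


n_false = count_false_positives()
print(f"{n_false} of 10,000 unchanged genes have P < 0.05")
```

The count should land close to 10,000 × 0.05 = 500, and lowering alpha reduces it proportionally.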

Various adjustments can be made to the P-values to reduce the number of false positives. The adjustments available in ArrayStar are listed below and can be applied to the P-values from any of the probabilistic statistical tests in ArrayStar.