TY - JOUR
T1 - A close examination of double filtering with fold change and t test in microarray analysis
AU - Zhang, Song
AU - Cao, Jing
N1 - Funding Information: This work has been supported in part by the U.S. National Institutes of Health UL1 RR024982. The authors thank the reviewers for their constructive comments and suggestions.
PY - 2009/12/8
Y1 - 2009/12/8
AB - Background: Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results: This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion: We demonstrate through hypothesis testing theory, simulation studies and real data examples, that well constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure.
UR - http://www.scopus.com/inward/record.url?scp=74049104875&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74049104875&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-10-402
DO - 10.1186/1471-2105-10-402
M3 - Article
C2 - 19995439
AN - SCOPUS:74049104875
SN - 1471-2105
VL - 10
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 402
ER -