Departmental Seminar - Statistical Science
Reconsideration of Double Filtering with Fold Change and T Test in Microarray Experiments
Jing Cao
February 8, 2008 at 3:00 pm
Abstract
Many researchers use a double filtering procedure, combining fold change with the t test, to identify differentially expressed genes, hoping that the two filters together give extra confidence in the results. We argue, however, that this extra confidence comes mainly from the much shorter gene list that double filtering produces. We show that the two statistics rest on contradictory assumptions: fold change assumes that all genes share a common variance, whereas the t statistic assumes gene-specific variances. For a given number of selected genes, double filtering by fold change and the t statistic loses power but does not necessarily achieve a lower false discovery rate. A more realistic assumption may be that gene variances arise from a mixture of homogeneous and heterogeneous variances, an assumption that some existing methods adopt implicitly. We propose a likelihood ratio test statistic based on this variance mixture assumption; the statistic can be constructed from a Bayesian mixture model. We further combine the Bayesian model with the optimal discovery procedure (ODP). A simulation study and an analysis of real microarray data are presented for demonstration.
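As a rough illustration of the double filtering procedure discussed in the abstract, the sketch below computes a per-gene log fold change and a two-sample t statistic, then keeps only the genes that pass both cutoffs. Everything here is an assumption made for illustration and is not taken from the talk: the function names, the simulated data, the pooled-variance form of the t statistic, and the cutoff values. The fold change ignores the per-gene variance entirely, while the t statistic divides by a per-gene variance estimate, which is one way to see why the two filters can disagree.

```python
import numpy as np

def fold_change_and_t(x, y):
    """Per-gene log fold change and pooled two-sample t statistic.

    x, y: arrays of shape (genes, replicates) on the log scale.
    The pooled-variance t statistic is an assumption of this sketch.
    """
    n1, n2 = x.shape[1], y.shape[1]
    diff = y.mean(axis=1) - x.mean(axis=1)  # log fold change, no variance term
    pooled_var = ((n1 - 1) * x.var(axis=1, ddof=1)
                  + (n2 - 1) * y.var(axis=1, ddof=1)) / (n1 + n2 - 2)
    t = diff / np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # gene-specific scale
    return diff, t

def double_filter(diff, t, fc_cutoff, t_cutoff):
    """Indices of genes whose |fold change| and |t| both reach their cutoffs."""
    return np.flatnonzero((np.abs(diff) >= fc_cutoff) & (np.abs(t) >= t_cutoff))

# Simulated data (hypothetical): 1000 genes, 4 replicates per group,
# gene-specific standard deviations, and a shift of 1.0 in the first 50 genes.
rng = np.random.default_rng(0)
genes, reps = 1000, 4
sigma = rng.uniform(0.2, 1.0, size=(genes, 1))
effect = np.zeros((genes, 1))
effect[:50] = 1.0
x = rng.normal(0.0, sigma, size=(genes, reps))
y = rng.normal(effect, sigma, size=(genes, reps))

diff, t = fold_change_and_t(x, y)
fc_cutoff, t_cutoff = 1.0, 2.447  # illustrative cutoffs, not recommendations
by_fc = np.flatnonzero(np.abs(diff) >= fc_cutoff)
by_t = np.flatnonzero(np.abs(t) >= t_cutoff)
both = double_filter(diff, t, fc_cutoff, t_cutoff)

print(len(by_fc), len(by_t), len(both))
```

By construction the doubly filtered list is the intersection of the two single-filter lists, so it can never be longer than either one. That is consistent with the abstract's point that a shorter list, rather than the second filter as such, may account for the apparent gain in confidence.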