Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

dc.creator: Yao, Jianchao
dc.creator: Chang, Chunqi
dc.creator: Salmi, Mari L.
dc.creator: Hung, Yeung Sam
dc.creator: Loraine, Ann
dc.creator: Roux, Stanley J.
dc.description: Jianchao Yao is with the Institute for Cellular and Molecular Biology and Department of Mathematics, University of Texas at Austin, Austin, Texas 78712, USA -- Chunqi Chang and Yeung Sam Hung are with the Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong, PR China -- Stanley J. Roux is with the Section of Molecular Cell and Developmental Biology, University of Texas at Austin, Austin, Texas 78712, USA -- Ann Loraine is with the Bioinformatics Research Center, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
dc.description.abstract: Background: Clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two correlations most widely used as similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing the replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. -- Results: In this study, we describe a novel correlation coefficient, the shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is shown by comparing it with the two most widely used correlation coefficients (the Pearson correlation coefficient and the SD-weighted correlation coefficient), using statistical measures on both synthetic expression data and real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering, were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. -- Conclusion: This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodologies.
dc.description.department: Institute for Cellular and Molecular Biology
dc.identifier.citation: Yao, Jianchao, Chunqi Chang, Mari L. Salmi, Yeung S. Hung, Ann Loraine, and Stanley J. Roux. “Genome-Scale Cluster Analysis of Replicated Microarrays Using Shrinkage Correlation Coefficient.” BMC Bioinformatics 9, no. 1 (June 18, 2008): 288. doi:10.1186/1471-2105-9-288.
dc.publisher: BMC Bioinformatics
dc.rights: Administrative deposit of works to UT Digital Repository: This work's author(s) is or was a University faculty member, student or staff member; this article is already available through open access at The public license is specified as CC-BY: The library makes the deposit as a matter of fair use (for scholarly, educational, and research purposes), and to preserve the work and further secure public access to the works of the University.
dc.subject: cluster analysis
dc.subject: replicated microarrays
dc.subject: shrinkage correlation coefficient
dc.title: Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient
