Genome-Scale Cluster Analysis of Replicated Microarrays Using Shrinkage Correlation Coefficient

dc.contributor.utaustinauthor: Salmi, Mari L. (en_US)
dc.contributor.utaustinauthor: Roux, Stanley J. (en_US)
dc.creator: Yao, Jianchao (en_US)
dc.creator: Chang, Chunqi (en_US)
dc.creator: Salmi, Mari L. (en_US)
dc.creator: Hung, Yeung S. (en_US)
dc.creator: Loraine, Ann (en_US)
dc.creator: Roux, Stanley J. (en_US)
dc.date.accessioned: 2016-10-28T19:50:32Z
dc.date.available: 2016-10-28T19:50:32Z
dc.date.issued: 2008-06 (en_US)
dc.description.abstract: Background: Clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely used similarity metrics for clustering microarray data. However, neither is optimal for analyzing the replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. Results: In this study, we describe a novel correlation coefficient, the shrinkage correlation coefficient (SCC), that fully exploits the similarity between replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group when clustering expression data, and provides a robust statistical estimate of the error in replicated microarray data. The value of SCC is shown by comparing it with the two most widely used correlation coefficients (the Pearson correlation coefficient and the SD-weighted correlation coefficient), using statistical measures on both synthetic expression data and real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical clustering and k-means clustering, were applied in the comparison. The comparison indicated that SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in plant lineages as diverse as mosses and angiosperms are also conserved among ferns.
Conclusion: This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodologies. (en_US)
dc.description.department: Cellular and Molecular Biology (en_US)
dc.description.sponsorship: NASA NAG2-1586, NAG 10-295 (en_US)
dc.description.sponsorship: CRCG of the University of Hong Kong (en_US)
dc.identifier: doi:10.15781/T2222R84R
dc.identifier.citation: Yao, Jianchao, Chunqi Chang, Mari L. Salmi, Yeung S. Hung, Ann Loraine, and Stanley J. Roux. "Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient." BMC bioinformatics, Vol. 9, No. 1 (Jun., 2008): 288. (en_US)
dc.identifier.doi: 10.1186/1471-2105-9-288 (en_US)
dc.identifier.issn: 1471-2105 (en_US)
dc.identifier.uri: http://hdl.handle.net/2152/43199
dc.language.iso: English (en_US)
dc.relation.ispartof: (en_US)
dc.relation.ispartofserial: BMC Bioinformatics (en_US)
dc.rights: Administrative deposit of works to Texas ScholarWorks: This work's author(s) is or was a University faculty member, student or staff member; this article is already available through open access or the publisher allows a PDF version of the article to be freely posted online. The library makes the deposit as a matter of fair use (for scholarly, educational, and research purposes), and to preserve the work and further secure public access to the works of the University. (en_US)
dc.rights.restriction: Open (en_US)
dc.subject: gene-expression profiles (en_US)
dc.subject: abscisic-acid (en_US)
dc.subject: ceratopteris-richardii (en_US)
dc.subject: mixture (en_US)
dc.subject: model (en_US)
dc.subject: tyrosine dephosphorylation (en_US)
dc.subject: germinating spores (en_US)
dc.subject: seed dormancy (en_US)
dc.subject: patterns (en_US)
dc.subject: arabidopsis (en_US)
dc.subject: protein (en_US)
dc.subject: biochemical research methods (en_US)
dc.subject: biotechnology & applied microbiology (en_US)
dc.subject: mathematical & computational biology (en_US)
dc.title: Genome-Scale Cluster Analysis of Replicated Microarrays Using Shrinkage Correlation Coefficient (en_US)
dc.type: Article (en_US)
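
The abstract compares SCC against a baseline in which genes are clustered hierarchically with the Pearson correlation coefficient as the similarity metric. The record does not give the SCC formula, so the sketch below illustrates only that Pearson baseline, not SCC itself. The array layout (genes x conditions x replicates), the use of 1 - r as the distance, average linkage, the function names, and the synthetic data are all assumptions made for illustration, not details taken from the article.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def replicate_means(data):
    """Average over the replicate axis of a (genes, conditions, replicates) array."""
    return data.mean(axis=2)


def pearson_distance(profiles):
    """Return the gene-by-gene distance matrix 1 - r, with r the Pearson correlation."""
    corr = np.corrcoef(profiles)  # rows are genes, columns are conditions
    dist = 1.0 - corr
    # Remove floating-point asymmetry and force an exactly zero diagonal.
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    return np.clip(dist, 0.0, 2.0)


def cluster_genes(data, n_clusters):
    """Average-linkage hierarchical clustering on Pearson distance (baseline only)."""
    dist = pearson_distance(replicate_means(data))
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical synthetic data: 5 genes rising and 5 genes falling across
# 6 conditions, each measured in 3 noisy replicates.
rng = np.random.default_rng(0)
rising = np.linspace(-1.0, 1.0, 6)
pattern = np.vstack([rising] * 5 + [-rising] * 5)  # shape (10, 6)
data = pattern[:, :, None] + rng.normal(scale=0.1, size=(10, 6, 3))

labels = cluster_genes(data, n_clusters=2)
print(labels)
```

Averaging the replicates before computing the correlation discards the within-group variance and the replicate count, which is the information the abstract says SCC is designed to use.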

Access full-text files

Original bundle

Name: 2008_06_Yao.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.65 KB
Format: Plain Text
Description: