Testing The Hourglass Model Of Vertebrate Development: Methods For Developmental Transcriptomic Meta-Analyses
Evolutionary variation is responsible for the broad diversity among organisms and is driven mechanistically by conserved developmental processes. The hourglass model of development posits that all organisms sharing a lineage (phylum) pass through a period of similarity during an intermediate stage of embryonic development, with greater divergence at the beginning and end. Quantitative assessment of this hypothesis is now feasible thanks to recent technological advances in gene expression profiling and the widespread availability of gene expression profiling data in public repositories. However, no best-practice standards have been established to guide meta-analysis of long time-series transcriptomic data across taxa. We are conducting a meta-analysis of gene expression profile studies in five vertebrate species (Danio rerio, Gallus gallus, Mus musculus, Xenopus laevis, and Xenopus tropicalis), from the beginning to the end of development, to test for the existence of an hourglass pattern and probe its molecular mechanisms. To achieve this, we identified and addressed three principal challenges: batch effects, multiple profiling platforms, and broad sampling with low sample size. First, the collected data were manually curated and tested for sources of variation to minimize batch effects caused by differing methodologies. Next, a pipeline of data transformations was devised to integrate data from microarray (MA) and RNAseq (RS) profiling techniques in X. tropicalis. Then, four naïve descriptive metrics (CV, mean FC, max FC, and max FC/%T) were evaluated as selectors for genes or orthologous gene groups (OGGs) showing important temporal expression patterns, with the aim of selecting developmentally important genes/OGGs and thereby reducing the ratio of factors to samples. PCA results indicated that curation yielded a meta-analysis data set with no detectable batch effects.
The MA-RS integration pipeline, on the other hand, was poorly effective at eliminating batch effects arising from the profiling platform and translated only to a limited extent to other genera. The expression importance metrics appear roughly equal in their ability to discriminate important patterns. Preliminary data show evidence of an hourglass pattern of gene expression, and the importance metrics are being applied in tandem to study the role of developmentally important genes in generating this pattern.
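
As an illustration of how descriptive metrics of this kind can rank genes by temporal variability, the following is a minimal sketch, not the pipeline used in this work. The abstract does not define the metrics, so the definitions here are assumptions: CV is taken as the sample standard deviation divided by the mean across time points, mean FC as the geometric mean of fold changes between consecutive time points, and max FC as the ratio of the highest to the lowest value in the series. The max FC/%T metric is omitted because its definition cannot be inferred from the text. The function names and the pseudocount are hypothetical.

```python
import numpy as np


def temporal_metrics(expr, pseudocount=1.0):
    """Compute assumed variability metrics for one gene's time series.

    expr: 1-D sequence of non-negative expression values ordered by time.
    pseudocount: hypothetical offset added to avoid division by zero.
    """
    x = np.asarray(expr, dtype=float) + pseudocount
    # Coefficient of variation: sample standard deviation over the mean.
    cv = x.std(ddof=1) / x.mean()
    # Absolute log2 fold change between each pair of consecutive time points.
    abs_log_fc = np.abs(np.diff(np.log2(x)))
    # Assumed "mean FC": geometric mean of the consecutive fold changes.
    mean_fc = 2.0 ** abs_log_fc.mean()
    # Assumed "max FC": largest value over smallest value in the series.
    max_fc = x.max() / x.min()
    return {"cv": float(cv), "mean_fc": float(mean_fc), "max_fc": float(max_fc)}


def rank_genes(matrix, metric="cv", top_n=2):
    """Return row indices of the top_n genes, highest metric score first."""
    scores = np.array([temporal_metrics(row)[metric] for row in matrix])
    order = np.argsort(scores)[::-1][:top_n]
    return [int(i) for i in order]
```

Under these assumptions, a gene with constant expression scores a CV of 0 and fold changes of 1, so it ranks below any gene whose expression changes over the series. Keeping only the top-ranked genes or OGGs is one way to reduce the number of factors relative to the number of samples.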