Poster Presentation 2019 Hunter Cell Biology Meeting

Comparative analysis of H3K27me3 domains establishes a repressive index for inferring regulatory genes governing cell identity from any chordate cell type (#110)

Enakshi Sinniah 1, Woo Jun Shim 2, Jun Xu 1, Mikael Boden 2, Nathan Palpant 1,3
  1. Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
  2. School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
  3. School of Biomedical Sciences, The University of Queensland, Brisbane, QLD, Australia

Identifying the mechanisms governing development and disease remains difficult because it is hard to enrich for the regulatory genes that control cell fate and function. We evaluated chromatin states from 111 NIH Roadmap Epigenomics samples and found that genes frequently carrying broad H3K27me3 domains across diverse cell types, a property we term repressive tendency (RT), are significantly enriched for cell type-specific regulatory genes. A gene’s RT value can serve as a fixed weight applied to any quantitative gene expression data, enriching for the regulatory genes that govern the given cell type. This analysis approach, which we call TRIAGE (Transcriptional Regulatory Inference Analysis of Gene Expression), is unsupervised and does not depend on external reference data, statistical cutoffs or prior knowledge. We used consortium data from the Human Cell Atlas, FANTOM, and a draft map of the human proteome to show that TRIAGE can enrich for regulatory genes from any cell or tissue type using any quantitative readout of gene expression, including RNA-seq (bulk or single cell), CAGE or quantitative proteomics. Given the highly conserved role of regulatory genes, we show that TRIAGE can be applied to quantitative gene expression data from any chordate species, ranging from tunicates to mammals, to identify the regulatory drivers of disease and development. TRIAGE also significantly outperforms prior approaches that use epigenetic data to predict regulatory genes. Lastly, we used TRIAGE to analyze scRNA-seq data from cardiac differentiation and identified SIX3, GAD1 and CRLF1 as novel candidate genes governing germ-layer specification. We used CRISPRi hPSCs to show that loss of function of these genes blocks derivation of definitive endoderm and boosts mesoderm induction.
Taken together, TRIAGE provides a computational approach for analyzing any quantitative readout of gene expression to identify the regulatory genes underlying cell identity and fate in development and disease, from any somatic cell type and any chordate species, thus opening new opportunities to discover mechanisms underlying organ development, disease and regeneration.
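
As a rough illustration of the weighting idea described above, the following Python sketch computes a per-gene RT as the fraction of samples in which the gene carries a broad H3K27me3 domain, then multiplies it by log-transformed expression to rank genes. The function names, the placeholder gene labels and values, the log transform, and the product form of the weighting are all assumptions made for illustration; the abstract states only that a gene's RT value weights quantitative expression data.

```python
import math


def repressive_tendency(broad_domain_calls):
    """Fraction of samples in which each gene carries a broad H3K27me3 domain.

    broad_domain_calls maps gene -> list of 0/1 calls, one per sample
    (a hypothetical input format, not the authors' data structure).
    """
    return {
        gene: sum(calls) / len(calls)
        for gene, calls in broad_domain_calls.items()
    }


def weight_expression(expression, rt):
    """Weight each gene's expression by its RT value.

    Assumption: log1p(expression) * RT. Genes without an RT value get 0.
    """
    return {
        gene: math.log1p(value) * rt.get(gene, 0.0)
        for gene, value in expression.items()
    }


def rank_genes(scores):
    """Return gene names ordered from highest to lowest weighted score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy inputs with placeholder gene labels (not real measurements).
calls = {
    "regulator_like": [1, 1, 1, 0],
    "intermediate": [1, 0, 0, 0],
    "housekeeping_like": [0, 0, 0, 0],
}
expression = {
    "regulator_like": 10.0,
    "intermediate": 10.0,
    "housekeeping_like": 1000.0,
}

rt = repressive_tendency(calls)
ranking = rank_genes(weight_expression(expression, rt))
print(ranking)
```

In this toy case the highly expressed but never-repressed gene falls to the bottom of the ranking, which is the intended effect of using RT as a fixed, expression-independent weight.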