- Why enrichment analysis?
- What is enrichment analysis?
- Gene ontology and pathways
- GENE ontology and pathways enrichment
- GENOMIC REGIONS enrichment
- Tools and references
People with similar genetic patterns are likely friends
Christakis NA, Fowler JH. "Friendship and natural selection." PNAS 2014 https://www.ncbi.nlm.nih.gov/pubmed/25024208
Enrichment analysis – detecting whether a group of objects has certain properties more (or less) frequently than would be expected by chance
Gene set - a priori classification of genes into biologically relevant groups (sets)
An ontology is a formal (hierarchical) representation of concepts and the relationships between them.
The objective of GO is to provide controlled vocabularies of terms for the description of gene products.
These terms are to be used as attributes of gene products, facilitating uniform queries across them.
Gene ontology describes multiple levels of detail of gene function.
https://www.ebi.ac.uk/QuickGO/
Different levels of evidence:
http://software.broadinstitute.org/gsea/msigdb/
https://github.com/stephenturner/msigdf
Self-contained \(H_0\): genes in the gene set do not have any association with the phenotype
Problem: restrictive, uses information only from the gene set itself
Competitive \(H_0\): genes in the gene set have the same level of association with a given phenotype as genes in the complement gene set
Problem: wrong assumption of independent gene sampling
Overrepresentation analysis, Hypergeometric test
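Let \(m\) be the total number of genes, \(j\) the number of genes in the functional category (gene set), \(n\) the number of selected (differentially expressed) genes, and \(k\) the number of selected genes that fall in the category.
If genes were selected at random, the expected value of \(k\) would be \(k_e=(n/m)*j\).
If \(k > k_e\), the functional category is said to be enriched, with a ratio of enrichment \(r=k/k_e\)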
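A minimal Python sketch of the expected count and the enrichment ratio. The function name and the example counts are hypothetical, chosen only for illustration; the symbols follow the slide (\(m\) total genes, \(j\) in the category, \(n\) selected, \(k\) selected and in the category).

```python
def expected_and_ratio(k, n, j, m):
    """Return (k_e, r): expected overlap under random selection and enrichment ratio."""
    if m <= 0 or n <= 0 or j <= 0:
        raise ValueError("m, n and j must be positive")
    k_e = n * j / m   # same as (n/m)*j
    r = k / k_e       # r > 1 suggests enrichment, r < 1 depletion
    return k_e, r


# Hypothetical counts: 10,000 genes, 100 in the category, 200 selected, 10 overlapping.
print(expected_and_ratio(k=10, n=200, j=100, m=10000))  # (2.0, 5.0)
```

The ratio alone says nothing about significance; that is what the hypergeometric test quantifies.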
| | Diff. exp. genes | Not diff. exp. genes | Total |
|---|---|---|---|
| In gene set | k | j-k | j |
| Not in gene set | n-k | m-n-j+k | m-j |
| Total | n | m-n | m |
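The table can be assembled directly from the four counts. The sketch below is a hypothetical Python helper (not part of any package named in these slides) that builds the 2x2 table and checks that no cell is negative.

```python
def contingency_table(k, j, n, m):
    """Build the 2x2 table: rows = in / not in gene set, columns = DE / not DE."""
    table = [
        [k, j - k],                 # in gene set
        [n - k, m - n - j + k],     # not in gene set
    ]
    if any(cell < 0 for row in table for cell in row):
        raise ValueError("inconsistent counts: a table cell is negative")
    return table


# Hypothetical counts, for illustration only.
table = contingency_table(k=10, j=100, n=200, m=10000)
print(table)                          # [[10, 90], [190, 9710]]
print([sum(row) for row in table])    # row totals: j and m-j
```

A table in this layout is the kind of input a 2x2 exact test such as `stats::fisher.test()` takes.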
What is the probability of having \(k\) or more genes from the category in the selected \(n\) genes?
\[P = \sum_{i=k}^n{\frac{\binom{m-j}{n-i}\binom{j}{i}}{{m \choose n}}}\]
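The sum above can be evaluated exactly with integer binomial coefficients. This is a minimal sketch using only Python's standard library; the function name `ora_pvalue` is hypothetical, and the summation stops at \(\min(n, j)\) because terms beyond that are zero.

```python
from math import comb


def ora_pvalue(k, n, j, m):
    """Upper-tail hypergeometric probability P(X >= k).

    m: total genes, j: genes in the category,
    n: selected genes, k: selected genes in the category.
    """
    upper = min(n, j)  # comb(j, i) is zero for i > j
    numerator = sum(comb(m - j, n - i) * comb(j, i) for i in range(k, upper + 1))
    return numerator / comb(m, n)


# Small hypothetical example: 10 genes, 4 in the category, 3 selected, 2 overlapping.
print(ora_pvalue(k=2, n=3, j=4, m=10))  # 40/120, about 0.333
```

With \(k = 0\) the sum covers the whole distribution and returns 1, which is a quick sanity check on the implementation.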
\(k < (n/m)*j\) - underrepresentation. What is the probability of having \(k\) or fewer genes from the category in the selected \(n\) genes?
\[P = \sum_{i=0}^k{\frac{\binom{m-j}{n-i}\binom{j}{i}}{{m \choose n}}}\]
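The lower-tail sum can be sketched the same way. The function name `under_pvalue` is hypothetical; symbols are as in the formula (\(m\) total genes, \(j\) in the category, \(n\) selected, \(k\) selected and in the category).

```python
from math import comb


def under_pvalue(k, n, j, m):
    """Lower-tail hypergeometric probability P(X <= k)."""
    upper = min(k, n)  # i cannot exceed the number of selected genes
    numerator = sum(comb(m - j, n - i) * comb(j, i) for i in range(0, upper + 1))
    return numerator / comb(m, n)


# Small hypothetical example: 10 genes, 4 in the category, 3 selected, at most 1 overlapping.
print(under_pvalue(k=1, n=3, j=4, m=10))  # 80/120, about 0.667
```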
Overrepresentation analysis (ORA)
stats::fisher.test()
GOstats::hyperGTest()
Example: https://github.com/mdozmorov/MDmisc/blob/master/R/gene_enrichment.R
Problems with Fisher's exact test
The outcome of the overrepresentation test depends on the significance threshold used to declare genes differentially expressed.
Functional categories in which many genes exhibit small changes may go undetected.
Genes are not independent, so a key assumption of Fisher's exact test is violated.
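The first problem, dependence on the significance threshold, can be seen in a toy Python sketch. All gene names and p-values below are made up for illustration; the point is only that the counts \(n\) and \(k\) fed into the test change with the cutoff.

```python
# Hypothetical per-gene p-values and a hypothetical gene set.
pvalues = {
    "g1": 0.001, "g2": 0.004, "g3": 0.02, "g4": 0.03,
    "g5": 0.04, "g6": 0.2, "g7": 0.5, "g8": 0.8,
}
gene_set = {"g3", "g4", "g5", "g7"}


def counts_at_threshold(pvalues, gene_set, alpha):
    """Return (n, k): number of selected genes and how many of them are in the set."""
    selected = {gene for gene, p in pvalues.items() if p < alpha}
    return len(selected), len(selected & gene_set)


for alpha in (0.01, 0.05):
    n, k = counts_at_threshold(pvalues, gene_set, alpha)
    print(f"alpha={alpha}: n={n}, k={k}")
```

At the stricter cutoff none of the gene-set members are selected, while at the looser cutoff three of them are, so the same data can yield opposite conclusions.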
Functional Class Scoring (FCS)
Gene set analysis (GSA). Mootha et al., 2003; modified by Subramanian, et al. "Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles." PNAS 2005 http://www.pnas.org/content/102/43/15545.abstract
Main rationale – functionally related genes often display a coordinated expression to accomplish their roles in the cells
Aims to identify gene sets with "subtle but coordinated" expression changes that would be missed by threshold-based selection of differentially expressed genes (DEGs)
Enrichment Score
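\[X_{Ri}=-\sqrt{\frac{G}{N-G}} \quad \text{if gene } R_i \text{ is not a member of the gene set}\]
\[X_{Ri}=\sqrt{\frac{N-G}{G}} \quad \text{if gene } R_i \text{ is a member of the gene set}\]
Here \(N\) is the number of genes in the ranked list \(R_1, \dots, R_N\) and \(G\) is the number of genes in the gene set.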
\[\max_{1 \le j \le N} \sum_{i=1}^j{X_{Ri}}\]
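The running-sum statistic can be sketched in a few lines of Python. This assumes the positive increment \(\sqrt{(N-G)/G}\) applies to genes in the set and the negative increment \(-\sqrt{G/(N-G)}\) to genes outside it, with \(N\) the length of the ranked list and \(G\) the number of set members in it. The function name and the toy gene labels are hypothetical, and no permutation-based significance assessment is included.

```python
from math import sqrt


def enrichment_score(ranked_genes, gene_set):
    """Maximum of the running sum of X_Ri over a ranked gene list."""
    N = len(ranked_genes)
    G = sum(1 for gene in ranked_genes if gene in gene_set)
    if G == 0 or G == N:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit = sqrt((N - G) / G)      # increment for a gene in the set
    miss = -sqrt(G / (N - G))    # decrement for a gene outside the set
    running, best = 0.0, float("-inf")
    for gene in ranked_genes:
        running += hit if gene in gene_set else miss
        best = max(best, running)
    return best


# Toy ranking: set members at the top give a high score, at the bottom a low one.
print(enrichment_score(["a", "b", "c", "d"], {"a", "b"}))  # 2.0
print(enrichment_score(["a", "b", "c", "d"], {"c", "d"}))  # 0.0
```

The increments are scaled so that the running sum over all \(N\) genes returns to zero, which is why the position of set members in the ranking, not their number, drives the score.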
Linear model-based
Impact analysis - incorporates topology of the pathway.
Sorin Draghici et al., “A Systems Biology Approach for Pathway Level Analysis,” Genome Research. 2007. https://www.ncbi.nlm.nih.gov/pubmed/17785539
Adi Laurentiu Tarca et al., “A Novel Signaling Pathway Impact Analysis,” Bioinformatics. 2009
Each genomic region has coordinates (unique IDs): Chromosome, Start, End
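A minimal Python sketch of this representation, with an overlap check of the kind that region-based enrichment relies on. The `Region` type, the `overlaps` helper, and the example coordinates are hypothetical; the sketch assumes 0-based, half-open intervals (start included, end excluded), which is a convention chosen here and not stated in the slides.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # assumed 0-based, inclusive
    end: int    # assumed exclusive


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


# Hypothetical coordinates, for illustration only.
snp_window = Region("chr1", 1000, 2000)
enhancer = Region("chr1", 1500, 2500)
other = Region("chr2", 1500, 2500)
print(overlaps(snp_window, enhancer))  # True
print(overlaps(snp_window, other))     # False
```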
Epigenomic (regulatory) regions - genomic regions annotated as carrying functional and/or regulatory potential
…
Enrichment = functional impact
goana, camera, roast, romer
FINE