Pathway enrichment results for graphical clusters (nodes, edges, and paths) of interest
Format
A data frame with 156906 rows and 22 variables:
query: character, not used, carried over from gprofiler2::gost() output
significant: logical, not used, carried over from gprofiler2::gost() output
term_size: double, effective pathway size from gprofiler2::gost() output
query_size: integer, size of input, i.e. list of Ensembl genes associated with differential features
intersection_size: double, size of the intersection between the input and the pathway members
precision: double, the proportion of genes in the input list that are annotated to the function (defined as intersection_size/query_size)
recall: double, the proportion of functionally annotated genes that the query recovers (defined as intersection_size/term_size)
term_id: character, pathway term ID
source: character, database corresponding to the pathway, one of: "KEGG", "REAC"
term_name: character, pathway name
effective_domain_size: integer, size of the custom background Ensembl gene set
source_order: integer, not used, carried over from gprofiler2::gost() output
parents: list, pathway parent(s)
evidence_codes: character, not used, carried over from gprofiler2::gost() output
intersection: character, intersection between input and pathway (Ensembl IDs). NA for metabolomics enrichments
gost_adj_p_value: double, BH-adjusted p-value returned by gprofiler2::gost(), ignored because p-values are only adjusted within each tissue/ome/cluster combination. Use the adj_p_value column instead.
computed_p_value: double, nominal hypergeometric p-value, computed from the gprofiler2::gost() output
cluster: character, graphical cluster (node, edge, or path) name
tissue: character, tissue abbreviation, one of TISSUE_ABBREV. Note that VENACV, OVARY, and TESTES were not included in the graphical representation of differential features due to missing groups (e.g., females trained for 1 week).
ome: character, assay abbreviation, one of ASSAY_ABBREV
kegg_id: character, pathway ID returned from FELLA::enrich()
adj_p_value: double, IHW FDR, calculated using IHW::ihw() with tissue as a covariate
graphical_cluster: character, cluster column with tissue prefix removed
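As a small illustration of how the size columns relate to precision and recall, the sketch below recomputes both ratios for a single row. The values are invented for illustration and are not taken from this data set.

```r
# Hypothetical values for one enrichment result (not from the real data)
row <- data.frame(
  term_size = 50,          # pathway members within the background
  query_size = 200L,       # genes in the input list
  intersection_size = 10   # genes shared by the input and the pathway
)

# precision: fraction of the input list annotated to the pathway
row$precision <- row$intersection_size / row$query_size

# recall: fraction of the pathway recovered by the input list
row$recall <- row$intersection_size / row$term_size

print(row)
```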
Details
All non-metabolite training-regulated features (5% FDR) were mapped to Ensembl gene symbols using FEATURE_TO_GENE. Training-regulated metabolites were mapped to KEGG IDs. For each graphical cluster of interest (i.e., the ten largest paths, two largest nodes, and two largest single edges with at least 20 features in each tissue, as well as all 8-week nodes), we performed pathway enrichment analysis separately for the Ensembl genes (or KEGG IDs for metabolites) associated with differential features in each ome.
For gene-centric omes (i.e., all but metabolomics), we performed enrichment analysis of KEGG and REACTOME rat pathways (organism "rnorvegicus") using gprofiler2::gost() with custom backgrounds defined by GENE_UNIVERSES. Only pathways with at least 10 and up to 200 members were tested. Because gprofiler2::gost() only returns adjusted p-values, we recalculated nominal p-values using a one-tailed hypergeometric test, which is consistent with how gprofiler2::gost() calculates enrichments. See MotrpacRatTraining6mo::cluster_pathway_enrichment() for implementation.
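The nominal p-value recalculation can be sketched with base R's phyper(). This is a minimal illustration under the assumption that the columns described under Format map onto the hypergeometric parameters as commented below. The counts are made up, and the actual code in MotrpacRatTraining6mo::cluster_pathway_enrichment() may differ in its details.

```r
# Hypothetical counts for one pathway in one cluster (not from the real data)
intersection_size <- 10          # input genes that are pathway members
term_size <- 50                  # pathway members in the background
query_size <- 200                # genes in the input list
effective_domain_size <- 10000   # genes in the custom background

# One-tailed (upper-tail) hypergeometric test: P(X >= intersection_size).
# With lower.tail = FALSE, phyper(q, ...) returns P(X > q), hence the "- 1".
computed_p_value <- phyper(
  q = intersection_size - 1,
  m = term_size,                          # "successes" in the background
  n = effective_domain_size - term_size,  # "failures" in the background
  k = query_size,                         # number of genes drawn
  lower.tail = FALSE
)

print(computed_p_value)
```

Subtracting 1 from the observed overlap matters: without it, the test would give the probability of an overlap strictly larger than the one observed.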
For metabolites, we performed enrichment of KEGG pathways using the hypergeometric method in FELLA::enrich() with custom backgrounds defined by GENE_UNIVERSES. See MotrpacRatTraining6mo::run_fella() for implementation.
Pathway enrichment analysis p-values were adjusted across all results using Independent Hypothesis Weighting (IHW) with tissue as a covariate.
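To show why the scope of the adjustment matters, the sketch below adjusts the same toy p-values within each group and then across all results. It uses plain Benjamini-Hochberg from base R as a stand-in for IHW, which additionally learns covariate-specific weights, so it only illustrates the difference in scope, not the IHW procedure itself. The group labels and p-values are hypothetical.

```r
# Toy nominal p-values from two hypothetical tissue/ome/cluster combinations
res <- data.frame(
  group = rep(c("A", "B"), each = 4),
  p = c(0.001, 0.01, 0.02, 0.04, 0.03, 0.2, 0.5, 0.9)
)

# BH within each combination (the scope of gost_adj_p_value)
res$within_group <- ave(
  res$p, res$group,
  FUN = function(x) p.adjust(x, method = "BH")
)

# BH across all results at once (the scope of adj_p_value;
# plain BH is used here instead of IHW to keep the sketch in base R)
res$across_all <- p.adjust(res$p, method = "BH")

print(res)
```

The two columns disagree for every row that has a p-value below 0.9, and the difference goes in both directions, so values adjusted within a combination are not comparable across combinations.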