Pathway enrichment results for graphical clusters (nodes, edges, and paths) of interest
Format
A data frame with 156906 rows and 22 variables:
query
character, not used, carried over from gprofiler2::gost() output
significant
logical, not used, carried over from gprofiler2::gost() output
term_size
double, effective pathway size from gprofiler2::gost() output
query_size
integer, size of input, i.e. list of Ensembl genes associated with differential features
intersection_size
double, size of the intersection between the input and the pathway members
precision
double, the proportion of genes in the input list that are annotated to the function (defined as intersection_size/query_size)
recall
double, the proportion of functionally annotated genes that the query recovers (defined as intersection_size/term_size)
term_id
character, pathway term ID
source
character, database corresponding to the pathway, one of: "KEGG", "REAC"
term_name
character, pathway name
effective_domain_size
integer, size of the custom background Ensembl gene set
source_order
integer, not used, carried over from gprofiler2::gost() output
parents
list, pathway parent(s)
evidence_codes
character, not used, carried over from gprofiler2::gost() output
intersection
character, intersection between input and pathway (Ensembl IDs). NA for metabolomics enrichments
gost_adj_p_value
double, BH-adjusted p-value returned by gprofiler2::gost(), ignored because p-values are only adjusted within each tissue/ome/cluster combination. Use the adj_p_value column instead.
computed_p_value
double, nominal hypergeometric p-value, computed from the gprofiler2::gost() output
cluster
character, graphical cluster (node, edge, or path) name
tissue
character, tissue abbreviation, one of TISSUE_ABBREV. Note that VENACV, OVARY, and TESTES were not included in the graphical representation of differential features due to missing groups (e.g., females trained for 1 week).
ome
character, assay abbreviation, one of ASSAY_ABBREV
kegg_id
character, pathway ID returned from FELLA::enrich()
adj_p_value
double, IHW FDR, calculated using IHW::ihw() with tissue as a covariate
graphical_cluster
character, cluster column with tissue prefix removed
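As a quick illustration of how the precision and recall columns relate to the size columns, the following minimal sketch recomputes both from a small made-up data frame. The object name enrich and all values in it are hypothetical and are not taken from this dataset; only the column names and the two ratios come from the definitions above.

```r
# Hypothetical rows (values invented for illustration only)
enrich <- data.frame(
  term_size         = c(40, 120),    # pathway sizes
  query_size        = c(100L, 100L), # input gene list sizes
  intersection_size = c(8, 3)        # overlap between input and pathway
)

# precision = intersection_size / query_size
enrich$precision <- enrich$intersection_size / enrich$query_size

# recall = intersection_size / term_size
enrich$recall <- enrich$intersection_size / enrich$term_size

print(enrich)
```

Both ratios share the same numerator, so a large pathway can show reasonable precision while still having low recall.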
Details
All non-metabolite training-regulated features (5% FDR) were mapped to Ensembl gene symbols using FEATURE_TO_GENE. Training-regulated metabolites were mapped to KEGG IDs. For each graphical cluster of interest (i.e., the ten largest paths, two largest nodes, and two largest single edges with at least 20 features in each tissue, as well as all 8-week nodes), we performed pathway enrichment analysis separately for the Ensembl genes (or KEGG IDs for metabolites) associated with differential features in each ome.
For gene-centric omes (i.e., all but metabolomics), we performed enrichment analysis of KEGG and REACTOME rat pathways (organism "rnorvegicus") using gprofiler2::gost() with custom backgrounds defined by GENE_UNIVERSES. Only pathways with at least 10 and up to 200 members were tested. Because gprofiler2::gost() only returns adjusted p-values, we recalculated nominal p-values using a one-tailed hypergeometric test, which is consistent with how gprofiler2::gost() calculates enrichments. See MotrpacRatTraining6mo::cluster_pathway_enrichment() for implementation.
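The nominal p-value recalculation can be sketched with base R as follows. This is a minimal example assuming the standard over-representation form of the one-tailed hypergeometric test (the probability of observing at least intersection_size pathway members in the input); it is not copied from MotrpacRatTraining6mo::cluster_pathway_enrichment(), which remains the reference for the actual implementation. All numeric values are hypothetical.

```r
# Hypothetical values for a single pathway test (not taken from the data)
term_size             <- 40L     # pathway members in the background
query_size            <- 100L    # input genes
intersection_size     <- 8L      # input genes that are in the pathway
effective_domain_size <- 10000L  # size of the custom background

# P(X >= intersection_size) under the hypergeometric distribution.
# phyper() returns P(X > q) with lower.tail = FALSE, hence the "- 1".
computed_p_value <- phyper(
  q = intersection_size - 1,
  m = term_size,                          # "successes" in the background
  n = effective_domain_size - term_size,  # "failures" in the background
  k = query_size,                         # number of draws
  lower.tail = FALSE
)

print(computed_p_value)
```

Subtracting 1 from intersection_size matters: without it, the observed overlap itself would be excluded from the tail and the p-value would be too small.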
For metabolites, we performed enrichment of KEGG pathways using the hypergeometric method in FELLA::enrich() with custom backgrounds defined by GENE_UNIVERSES. See MotrpacRatTraining6mo::run_fella() for implementation.
Pathway enrichment analysis p-values were adjusted across all results using Independent Hypothesis Weighting (IHW) with tissue as a covariate.
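The adjustment step might look roughly like the sketch below. The helper name adjust_with_ihw is hypothetical, and the alpha level and the choice to pass tissue as a factor are assumptions, since neither is stated here. The function is only defined, not run, because it requires the Bioconductor package IHW to be installed.

```r
# Hypothetical helper: IHW-adjusted p-values with tissue as the covariate.
# alpha = 0.05 is an assumed nominal level, not taken from the documentation.
adjust_with_ihw <- function(p_values, tissue, alpha = 0.05) {
  if (!requireNamespace("IHW", quietly = TRUE)) {
    stop("The IHW package is required for this sketch.")
  }
  # Tissue is categorical, so it is passed as a factor (assumption)
  fit <- IHW::ihw(
    pvalues    = p_values,
    covariates = factor(tissue),
    alpha      = alpha
  )
  # Extract the weighted, adjusted p-values
  IHW::adj_pvalues(fit)
}
```

Applied to this data frame, the inputs would correspond to the computed_p_value and tissue columns, and the result to adj_p_value.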