Graph pathway enrichment results — GRAPH_PW_ENRICH • MotrpacRatTraining6moData

Pathway enrichment results for graphical clusters (nodes, edges, and paths) of interest

Usage

GRAPH_PW_ENRICH

Format

A data frame with 156906 rows and 22 variables:

query: character, not used, carried over from gprofiler2::gost() output
significant: logical, not used, carried over from gprofiler2::gost() output
term_size: double, effective pathway size from gprofiler2::gost() output
query_size: integer, size of input, i.e. list of Ensembl genes associated with differential features
intersection_size: double, size of the intersection between the input and the pathway members
precision: double, the proportion of genes in the input list that are annotated to the function (defined as intersection_size/query_size)
recall: double, the proportion of functionally annotated genes that the query recovers (defined as intersection_size/term_size)
term_id: character, pathway term ID
source: character, database corresponding to the pathway, one of: "KEGG", "REAC"
term_name: character, pathway name
effective_domain_size: integer, size of the custom background Ensembl gene set
source_order: integer, not used, carried over from gprofiler2::gost() output
parents: list, pathway parent(s)
evidence_codes: character, not used, carried over from gprofiler2::gost() output
intersection: character, intersection between input and pathway (Ensembl IDs). NA for metabolomics enrichments
gost_adj_p_value: double, BH-adjusted p-value returned by gprofiler2::gost(). Not used, because these p-values are only adjusted within each tissue/ome/cluster combination; use the adj_p_value column instead.
computed_p_value: double, nominal hypergeometric p-value, computed from the gprofiler2::gost() output
cluster: character, graphical cluster (node, edge, or path) name
tissue: character, tissue abbreviation, one of TISSUE_ABBREV. Note that VENACV, OVARY, and TESTES were not included in the graphical representation of differential features due to missing groups (e.g., females trained for 1 week).
ome: character, assay abbreviation, one of ASSAY_ABBREV
kegg_id: character, pathway ID returned from FELLA::enrich()
adj_p_value: double, IHW FDR, calculated using IHW::ihw() with tissue as a covariate
graphical_cluster: character, cluster column with tissue prefix removed
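
The precision and recall definitions above, and the advice to rely on adj_p_value rather than gost_adj_p_value, can be illustrated on a toy data frame. This is a minimal sketch: the rows, term IDs, and the 0.05 cutoff are made up for illustration and are not taken from GRAPH_PW_ENRICH.

```r
# Toy rows using a subset of the documented column names (all values are hypothetical)
toy <- data.frame(
  term_id = c("term_A", "term_B"),    # placeholder IDs, not real pathway IDs
  intersection_size = c(10, 4),
  query_size = c(100L, 200L),
  term_size = c(50, 40),
  gost_adj_p_value = c(0.03, 0.04),   # adjusted within one query only
  adj_p_value = c(0.01, 0.20),        # adjusted across all results
  stringsAsFactors = FALSE
)

# precision = intersection_size / query_size
toy$precision <- toy$intersection_size / toy$query_size

# recall = intersection_size / term_size
toy$recall <- toy$intersection_size / toy$term_size

# Select enrichments on adj_p_value; the 0.05 threshold is an assumption
sig <- toy[toy$adj_p_value < 0.05, ]
```

In this toy example both rows would pass a 0.05 cutoff on gost_adj_p_value, but only the first passes on adj_p_value.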

Details

All non-metabolite training-regulated features (5% FDR) were mapped to Ensembl gene symbols using FEATURE_TO_GENE. Training-regulated metabolites were mapped to KEGG IDs. For each graphical cluster of interest (i.e., the ten largest paths, two largest nodes, and two largest single edges with at least 20 features in each tissue, as well as all 8-week nodes), we performed pathway enrichment analysis separately for the Ensembl genes (or KEGG IDs for metabolites) associated with differential features in each ome.

For gene-centric omes (i.e., all but metabolomics) we performed enrichment analysis of KEGG and REACTOME rat pathways (organism "rnorvegicus") using gprofiler2::gost() with custom backgrounds defined by GENE_UNIVERSES. Only pathways with at least 10 and up to 200 members were tested. Because gprofiler2::gost() only returns adjusted p-values, we recalculated nominal p-values using a one-tailed hypergeometric test, which is consistent with how gprofiler2::gost() calculates enrichments. See MotrpacRatTraining6mo::cluster_pathway_enrichment() for implementation.
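
As a rough illustration of how a nominal p-value could be recomputed from the gprofiler2::gost() columns, the sketch below uses base R's stats::phyper(). It assumes the one-tailed test is the upper tail, P(X >= intersection_size), with term_size as the number of "successes" in a background of effective_domain_size genes. The helper name hypergeom_p and the input numbers are hypothetical; see MotrpacRatTraining6mo::cluster_pathway_enrichment() for the actual implementation.

```r
# Upper-tail hypergeometric p-value: probability of drawing at least
# `intersection_size` pathway members in a sample of `query_size` genes
# from a background of `effective_domain_size` genes, of which `term_size`
# belong to the pathway. (Hypothetical helper, not the package's function.)
hypergeom_p <- function(intersection_size, term_size, query_size,
                        effective_domain_size) {
  stats::phyper(
    q = intersection_size - 1,             # P(X > q) with q = x - 1 gives P(X >= x)
    m = term_size,                         # pathway members in the background
    n = effective_domain_size - term_size, # non-members in the background
    k = query_size,                        # number of genes drawn (the input list)
    lower.tail = FALSE
  )
}

# Made-up numbers for illustration
p <- hypergeom_p(
  intersection_size = 8,
  term_size = 50,
  query_size = 100,
  effective_domain_size = 10000
)
```

Subtracting 1 from the intersection size matters: with lower.tail = FALSE, phyper() returns P(X > q), so passing the observed overlap directly would exclude the observed value from the tail.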

For metabolites, we performed enrichment of KEGG pathways using the hypergeometric method in FELLA::enrich() with custom backgrounds defined by GENE_UNIVERSES. See MotrpacRatTraining6mo::run_fella() for implementation.

Pathway enrichment analysis p-values were adjusted across all results using Independent Hypothesis Weighting (IHW) with tissue as a covariate.
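
The adjustment step might look roughly like the following sketch, which applies IHW::ihw() to simulated p-values with a tissue factor as the covariate. The simulated data, the tissue labels, the wrapper name adjust_with_ihw, and the alpha of 0.05 are all assumptions made for illustration; the actual call used to produce adj_p_value is not shown in this documentation.

```r
set.seed(1)

# Simulated inputs: 2000 nominal p-values for each of four hypothetical tissues
tissues <- factor(rep(c("tissue_A", "tissue_B", "tissue_C", "tissue_D"),
                      each = 2000))
pvals <- runif(length(tissues))

# Give one tissue an excess of small p-values so the covariate is informative
idx <- which(tissues == "tissue_A")[1:200]
pvals[idx] <- rbeta(200, 0.2, 5)

# Hypothetical wrapper: IHW-adjusted p-values with a factor covariate.
# Returns NULL if the Bioconductor package IHW is not installed.
adjust_with_ihw <- function(pvalues, covariate, alpha = 0.05) {
  if (!requireNamespace("IHW", quietly = TRUE)) {
    message("Package 'IHW' is not installed; skipping adjustment.")
    return(NULL)
  }
  res <- IHW::ihw(pvalues, covariate, alpha = alpha)
  IHW::adj_pvalues(res)
}

adj <- adjust_with_ihw(pvals, tissues)
```

Because all p-values are passed in one call, the resulting adjusted values are comparable across tissues, omes, and clusters, unlike the per-query adjustment in gost_adj_p_value.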