MotrpacHumanPreSuspensionAnalysis: Differential Analysis

library(MotrpacHumanPreSuspensionAnalysis)

Overview

This vignette demonstrates how to use load_differential_analysis() to access differential analysis (DA) results generated by the MoTrPAC Human Pre-COVID sedentary adult study. The function provides a unified interface for loading curated summary statistics across multiple molecular assays (“omes”) and tissues, with consistent column structure and metadata harmonization.

The package is designed to distribute summary-level results only. Subject-level data are not included and must be requested separately through MoTrPAC data access procedures.

Basic Usage

DA_list <- load_differential_analysis()
#> Please remember that the lowest CV Metabolite is chosen and the
#>             relevant refmet name is used. If you're not able to find your desired
#>             metabolite, look through the METABOLOMICS_CV object for the relevant
#>             refmet/feature name.

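This loads every tissue and ome bundled with the package, using lazily loaded package data. Epigenomics results are the exception: they are only fetched when epigen = TRUE (see the Epigenomics Data section). Results are returned as a nested list, organized first by tissue and then by assay.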
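Because the return value is a list of lists, base R list tools are enough to take stock of what was loaded. The sketch below uses a small hand-built stand-in for DA_list (toy_DA_list and summarise_da_list are hypothetical names, and the values are invented for illustration) and tabulates the number of rows per tissue and assay.

```r
# Toy stand-in with the same tissue -> assay nesting as DA_list.
# Identifiers and values are invented for illustration only.
toy_DA_list <- list(
  adipose = list(
    "prot-pr" = data.frame(feature_id = c("feat_1", "feat_2"),
                           logFC = c(0.74, 0.58),
                           stringsAsFactors = FALSE)
  ),
  muscle = list(
    "prot-pr" = data.frame(feature_id = "feat_3", logFC = -0.86,
                           stringsAsFactors = FALSE),
    "transcript-rna-seq" = data.frame(feature_id = c("gene_a", "gene_b", "gene_c"),
                                      logFC = c(0.1, -0.2, 0.3),
                                      stringsAsFactors = FALSE)
  )
)

# Hypothetical helper: one summary row per tissue/assay table.
summarise_da_list <- function(da_list) {
  rows <- lapply(names(da_list), function(tissue) {
    assays <- da_list[[tissue]]
    data.frame(
      tissue = tissue,
      assay = names(assays),
      n_rows = unname(vapply(assays, nrow, integer(1))),
      stringsAsFactors = FALSE,
      row.names = NULL
    )
  })
  do.call(rbind, rows)
}

inventory <- summarise_da_list(toy_DA_list)
print(inventory)
```

With the package loaded, the same helper could presumably be applied to the real DA_list, assuming every element at the second level is a table.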

To inspect a single result table:

str(DA_list[["adipose"]][["prot-pr"]])
#> Classes 'data.table' and 'data.frame':   63504 obs. of  20 variables:
#>  $ tissue            : chr  "adipose" "adipose" "adipose" "adipose" ...
#>  $ assay             : chr  "prot-pr" "prot-pr" "prot-pr" "prot-pr" ...
#>  $ full_model        : Factor w/ 1 level "~ 0 + group_timepoint +  BMI + calculatedAge + Sex + (1 | pid)": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ contrast          : Factor w/ 9 levels "group_timepointADUEndur.post_3.5_4_hr - group_timepointADUEndur.pre_exercise - group_timepointADUControl.post_3"| __truncated__,..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ contrast_short    : Factor w/ 33 levels "Endur.during_20_min - Control.during_20_min (delta-delta)",..: 5 5 5 5 5 5 5 5 5 5 ...
#>  $ contrast_type     : Factor w/ 5 levels "exercise_with_controls",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ contrast_category : Factor w/ 6 levels "EE-CON","RE-CON",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ randomGroupCode   : chr  "ADUEndur" "ADUEndur" "ADUEndur" "ADUEndur" ...
#>  $ Timepoint         : Factor w/ 7 levels "pre_exercise",..: 6 6 6 6 6 6 6 6 6 6 ...
#>  $ feature_id        : chr  "O00287" "Q8TE02" "Q8NHG8" "P10912" ...
#>  $ logFC             : num  0.74 0.578 0.609 -0.857 -0.71 ...
#>  $ CI.L              : num  0.533 0.414 0.463 -1.087 -0.933 ...
#>  $ CI.R              : num  0.946 0.742 0.756 -0.627 -0.487 ...
#>  $ degrees_of_freedom: num  30.3 19.5 21.2 19.4 12.7 ...
#>  $ logLik            : num  -15.5 -20.1 -40.5 -32.2 -32.7 ...
#>  $ t                 : num  7.31 7.21 8.75 -7.73 -6.52 ...
#>  $ AveExpr           : num  0.0188 -1.1554 -0.2189 -0.205 -0.5905 ...
#>  $ z.std             : num  5.53 5.38 5.35 -5.3 -5.04 ...
#>  $ p_value           : num  3.22e-08 7.53e-08 8.96e-08 1.14e-07 4.62e-07 ...
#>  $ adj_p_value       : num  0.000202 0.000202 0.000202 0.000202 0.000651 ...
#>  - attr(*, ".internal.selfref")=<externalptr> 
#>  - attr(*, "sorted")= chr [1:4] "full_model" "contrast" "p_value" "feature_id"
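A common next step with a table of this shape is to pick one contrast and keep the features that pass an adjusted p-value cutoff. The following is a minimal sketch on a toy data frame that reuses the column names contrast_short, logFC, and adj_p_value from the output above; the rows, the contrast labels, the helper name top_hits, and the 0.05 cutoff are all assumptions made for illustration.

```r
# Toy table reusing column names from the str() output.
# Rows, contrast labels, and the cutoff are illustrative assumptions.
toy_da <- data.frame(
  feature_id = c("feat_1", "feat_2", "feat_3", "feat_4"),
  contrast_short = c("A - B", "A - B", "A - B", "C - D"),
  logFC = c(0.74, 0.578, -0.1, -0.857),
  adj_p_value = c(0.0002, 0.2, 0.01, 0.0002),
  stringsAsFactors = FALSE
)

# Hypothetical helper: significant features for one contrast,
# ordered by absolute effect size (largest first).
top_hits <- function(da, contrast, alpha = 0.05) {
  keep <- da$contrast_short == contrast & da$adj_p_value < alpha
  hits <- da[keep, , drop = FALSE]
  hits[order(-abs(hits$logFC)), , drop = FALSE]
}

hits <- top_hits(toy_da, contrast = "A - B")
print(hits)
```

The comparison with `==` also works when contrast_short is a factor, as it is in the real tables.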

Selecting Tissues and Omics Modalities

Users can restrict loading to specific tissues and omes to reduce memory usage and improve clarity.

DA_list <- load_differential_analysis(
  selected_tissues = c("muscle"),
  selected_omes = c("transcript-rna-seq", "prot-pr")
)

Internally, metabolomics platforms are handled with special care: if any metabolomics assay is requested, all metabolomics results are loaded to ensure consistent platform-level filtering.

Note: For metabolomics assays, multiple features may map to the same reference metabolite. To reduce redundancy:

The lowest coefficient-of-variation (CV) feature is selected.

Feature identifiers are replaced with standardized RefMet names.

P-values are recalculated within contrasts after filtering.

Users should consult the METABOLOMICS_CVS object (the load message refers to it as METABOLOMICS_CV) if a metabolite of interest appears to be missing.
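The three steps in the note can be sketched on toy data as follows. This is not the package's implementation: the column names refmet_name and cv are hypothetical, the values are invented, and Benjamini-Hochberg is assumed as the adjustment method because the vignette does not name one.

```r
# Toy metabolomics results; refmet_name and cv are hypothetical column names.
toy_metab <- data.frame(
  contrast = rep(c("c1", "c2"), each = 3),
  feature_id = rep(c("f1", "f2", "f3"), times = 2),
  refmet_name = rep(c("Alanine", "Alanine", "Glycine"), times = 2),
  cv = rep(c(0.30, 0.10, 0.20), times = 2),
  p_value = c(0.01, 0.04, 0.20, 0.50, 0.03, 0.001),
  stringsAsFactors = FALSE
)

# Step 1: within each contrast and RefMet name, keep the lowest-CV feature.
ord <- order(toy_metab$contrast, toy_metab$refmet_name, toy_metab$cv)
sorted <- toy_metab[ord, ]
dedup <- sorted[!duplicated(sorted[, c("contrast", "refmet_name")]), ]

# Step 2: use the RefMet name as the feature identifier.
dedup$feature_id <- dedup$refmet_name

# Step 3: recompute adjusted p-values within each contrast.
# BH is an assumption; the source does not state the method.
dedup$adj_p_value <- ave(dedup$p_value, dedup$contrast,
                         FUN = function(p) p.adjust(p, method = "BH"))
rownames(dedup) <- NULL
print(dedup)
```

In this toy example feature f1 has the smaller p-value for Alanine in contrast c1, but f2 is retained because its CV is lower, which is why a feature a user expects to see can appear to be missing.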

To list the valid identifiers for omes and tissues:

ome_available_list()
#>  [1] "prot-ol"              "prot-ph"              "prot-pr"             
#>  [4] "transcript-rna-seq"   "epigen-methylcap-seq" "epigen-atac-seq"     
#>  [7] "metab-u-hilicpos"     "metab-u-ionpneg"      "metab-u-lrpneg"      
#> [10] "metab-u-lrppos"       "metab-u-rpneg"        "metab-u-rppos"       
#> [13] "metab-t-amines"       "metab-t-conv"         "metab-t-imm-crt"     
#> [16] "metab-t-oxylipneg"    "metab-t-tca"          "metab-t-nuc"         
#> [19] "metab-t-acoa"         "metab-t-ka"           "metab-meta-reg"
tissue_available_list()
#> The available tissues are placed into overarching categories. For example, an assay using PBMCs would be categorized as blood.
#> [1] "adipose" "blood"   "muscle"
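Since tissue and ome names must match these identifiers, it can help to check a selection before starting a load. The sketch below hard-codes a few identifiers copied from the printed output so that it runs on its own; check_choices is a hypothetical helper, not a package function.

```r
# Identifiers copied from the printed output (omes shortened for brevity).
known_tissues <- c("adipose", "blood", "muscle")
known_omes <- c("prot-ol", "prot-ph", "prot-pr", "transcript-rna-seq")

# Hypothetical helper: fail early with a readable message on unknown names.
check_choices <- function(requested, available, what) {
  unknown <- setdiff(requested, available)
  if (length(unknown) > 0) {
    stop(sprintf("Unknown %s: %s. Available: %s",
                 what,
                 paste(unknown, collapse = ", "),
                 paste(available, collapse = ", ")),
         call. = FALSE)
  }
  invisible(requested)
}

# A valid selection passes silently.
check_choices(c("muscle"), known_tissues, "tissue")

# An invalid selection raises an error; capture the message for display.
result <- tryCatch(
  check_choices("proteomics", known_omes, "ome"),
  error = function(e) conditionMessage(e)
)
print(result)
```

With the package loaded, the two vectors could presumably be taken from tissue_available_list() and ome_available_list(), assuming both return character vectors as their printed output suggests.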

Epigenomics Data (ATAC-seq and MethylCap-seq)

Epigenomics results are not stored directly in the package due to file size constraints. When epigen = TRUE, they are downloaded from the MoTrPAC AWS location, which takes considerably longer than loading the bundled data objects.

DA_list <- load_differential_analysis(
  epigen = TRUE
)

Note: Epigenomics downloads can take 30 minutes or longer.
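Given the download time, one option is to save the result locally after the first call and reuse it in later sessions. The following is a minimal sketch of that idea using saveRDS() and readRDS(); cached_load is a hypothetical helper, and slow_loader is a stand-in for the real call so that the example runs without a download.

```r
# Hypothetical caching wrapper: run a slow loader once, then reuse the saved copy.
cached_load <- function(loader, cache_file) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  result <- loader()
  saveRDS(result, cache_file)
  result
}

# Stand-in for the slow call. With the package loaded this could be
# function() load_differential_analysis(epigen = TRUE).
counter <- new.env()
counter$n <- 0
slow_loader <- function() {
  counter$n <- counter$n + 1
  list(muscle = list(
    "epigen-atac-seq" = data.frame(feature_id = "peak_1", logFC = 0.5,
                                   stringsAsFactors = FALSE)
  ))
}

cache_file <- tempfile(fileext = ".rds")
first <- cached_load(slow_loader, cache_file)   # runs the loader, writes the cache
second <- cached_load(slow_loader, cache_file)  # reads the cache, loader not called
print(counter$n)
unlink(cache_file)
```

In real use the cache file would live at a stable path rather than a temporary one, and it would need to be deleted by hand whenever the package data is updated.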

Splicing Data

SPLICING_DA is a data object containing results from a differential alternative splicing analysis. The analytical workflow used to generate these results (read alignment, isoform quantification, splicing event detection, and statistical testing) is described in detail in: “Characterization of exercise-modulated alternative splicing landscape in human skeletal muscle, adipose tissue, and blood in the MoTrPAC Study”. Due to file size limitations, only significant results (FDR < 0.05) are included in the data object.

names(SPLICING_DA)
#> [1] "adipose" "blood"   "muscle"

Returning Results as a Single Matrix

Because the differential analysis results largely share one format, you can set single_matrix = TRUE to return them as one large matrix instead of a nested list, if you find that easier to work with.

DA_matrix <- load_differential_analysis(
  single_matrix = TRUE
)
#> Please remember that the lowest CV Metabolite is chosen and the
#>             relevant refmet name is used. If you're not able to find your desired
#>             metabolite, look through the METABOLOMICS_CV object for the relevant
#>             refmet/feature name.
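The relationship between the nested list and a single stacked table can be illustrated in base R. This sketch shows the general idea on toy data only; it is not necessarily how single_matrix = TRUE is implemented, and it assumes every table carries tissue and assay columns, as the str() output earlier suggests.

```r
# Toy nested list; each table carries its own tissue and assay columns.
# Identifiers and values are invented for illustration.
toy_DA_list <- list(
  adipose = list(
    "prot-pr" = data.frame(tissue = "adipose", assay = "prot-pr",
                           feature_id = "feat_1", logFC = 0.74,
                           stringsAsFactors = FALSE)
  ),
  muscle = list(
    "prot-pr" = data.frame(tissue = "muscle", assay = "prot-pr",
                           feature_id = "feat_2", logFC = -0.86,
                           stringsAsFactors = FALSE),
    "transcript-rna-seq" = data.frame(tissue = "muscle",
                                      assay = "transcript-rna-seq",
                                      feature_id = "gene_a", logFC = 0.3,
                                      stringsAsFactors = FALSE)
  )
)

# Flatten the tissue -> assay nesting by one level, then stack the tables.
tables <- unlist(toy_DA_list, recursive = FALSE)
stacked <- do.call(rbind, tables)
rownames(stacked) <- NULL
print(stacked)

# The stacked form can be split back by tissue when needed.
by_tissue <- split(stacked, stacked$tissue)
```

Stacking works here because all toy tables have identical columns; tables with differing columns would need to be aligned first.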

Feature-to-Gene Annotation

Setting combine_with_featgene = TRUE merges DA results with the HUMAN_FEATURE_TO_GENE mapping table, adding gene-level annotations where applicable.

DA_list <- load_differential_analysis(
  combine_with_featgene = TRUE
)
#> Please remember that the lowest CV Metabolite is chosen and the
#>             relevant refmet name is used. If you're not able to find your desired
#>             metabolite, look through the METABOLOMICS_CV object for the relevant
#>             refmet/feature name.

This is particularly useful for transcriptomic and proteomic assays when integrating results across omes.
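Conceptually, this kind of annotation is a left join of the DA table onto a feature-to-gene mapping. The sketch below shows the idea with base R merge() on toy data; the gene_symbol column name and all identifiers are placeholders, and the actual columns of HUMAN_FEATURE_TO_GENE may differ.

```r
# Toy DA results; the last feature has no entry in the mapping table.
toy_da <- data.frame(
  feature_id = c("feat_1", "feat_2", "feat_unmapped"),
  logFC = c(0.74, 0.578, 0.1),
  stringsAsFactors = FALSE
)

# Toy mapping table; gene_symbol is a hypothetical column name
# and the symbols are placeholders, not real annotations.
toy_feature_to_gene <- data.frame(
  feature_id = c("feat_1", "feat_2"),
  gene_symbol = c("GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Left join: keep every DA row, fill gene_symbol with NA where no mapping exists.
annotated <- merge(toy_da, toy_feature_to_gene,
                   by = "feature_id", all.x = TRUE, sort = FALSE)
print(annotated)
```

Keeping unmatched rows (all.x = TRUE) matters for assays such as metabolomics, where a gene-level annotation may not apply.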