Timewise and training differential analysis wrapper for RRBS data.
Usage
rrbs_differential_analysis(
  y,
  PHENO = MotrpacRatTraining6moData::PHENO,
  METHYL_META = MotrpacRatTraining6moData::METHYL_META,
  verbose = TRUE,
  adj_pct_unaligned = FALSE,
  samples_to_remove = data.table::data.table(MotrpacRatTraining6moData::OUTLIERS)[assay == "METHYL", viallabel],
  edger_tol = 1e-05,
  dataset_name = ""
)
Arguments
- y
An edgeR::DGEList() object. y$genes is a metadata data frame with the locus coordinates (see details) and, at minimum, these fields: Chr, EntrezID, Symbol, and Strand.
- PHENO
A data frame with a row per sample. Contains at least the following columns: "sex", "group". MotrpacRatTraining6moData::PHENO by default.
- METHYL_META
A data frame with a row per sample. Contains the RRBS pipeline QA/QC scores. Contains at least the following columns: "pct_Unaligned". MotrpacRatTraining6moData::METHYL_META by default.
- verbose
A logical. If TRUE, comments about pipeline progress are printed. Default is TRUE.
- adj_pct_unaligned
A logical. If TRUE, adjust for the percentage of unaligned reads. Default is FALSE.
- samples_to_remove
A character vector of the IDs of samples to remove (e.g., identified outliers or failed samples). By default, the METHYL samples in MotrpacRatTraining6moData::OUTLIERS.
- edger_tol
A number. An internal parameter of edgeR. Default is 1e-05. Consider increasing if the algorithm takes too long.
- dataset_name
A character string. The name of the current dataset, which is added to the output. Default is "".
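The default for samples_to_remove shown in Usage selects the viallabel values of the outlier table whose assay is "METHYL". The following is a minimal base-R sketch of that selection, not the package's own code. It uses a toy stand-in table so that it runs without the data package; the vial labels and the "OTHER" assay value are hypothetical.

```r
# Toy stand-in for MotrpacRatTraining6moData::OUTLIERS. Only the two columns
# used by the default argument are included, and all values are made up.
outliers <- data.frame(
  assay = c("METHYL", "OTHER", "METHYL"),
  viallabel = c("vial_A", "vial_B", "vial_C"),
  stringsAsFactors = FALSE
)

# Base-R equivalent of:
#   data.table::data.table(OUTLIERS)[assay == "METHYL", viallabel]
samples_to_remove <- outliers$viallabel[outliers$assay == "METHYL"]
print(samples_to_remove)
```

A custom character vector built this way can be passed as samples_to_remove to exclude a different set of samples.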
Value
A list. The first item is called timewise and contains the contrast-specific differential analysis results. The second item is called training and contains the overall training-level significance per locus.
Examples
if (FALSE) { # \dontrun{
data(PHENO)
data(METHYL_META)
# Raw data by tissue are available as RData files in Google Cloud Storage with the following URLs:
# Brown adipose:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_BAT_RAW_DATA.rda
# Heart:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_HEART_RAW_DATA.rda
# Hippocampus:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_HIPPOC_RAW_DATA.rda
# Kidney:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_KIDNEY_RAW_DATA.rda
# Lung:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_LUNG_RAW_DATA.rda
# Liver:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_LIVER_RAW_DATA.rda
# Gastrocnemius:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_SKMGN_RAW_DATA.rda
# White adipose:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_WATSC_RAW_DATA.rda
# load_methyl_raw_data() can be used to download data for a tissue, e.g.,
# download the gastrocnemius data and load the data object into this session:
yall = load_methyl_raw_data("SKM-GN")
# for the simplicity of this example, we subset the data to 5000 loci
y = yall[1:5000,]
dea_res = rrbs_differential_analysis(y, PHENO, METHYL_META, adj_pct_unaligned = TRUE)
head(dea_res$timewise)
head(dea_res$training)
# Alternatively, you can use the processed datasets.
# These are also available through the Google Cloud directory:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/
# File formats are METHYL_XXX_NORM_DATA.rda and METHYL_XXX_DA.rda
# where XXX is the tissue name:
# BAT (brown adipose), HEART, HIPPOC (hippocampus), KIDNEY, LUNG, LIVER,
# SKMGN (gastrocnemius skeletal muscle), and WATSC (white adipose)
# You can use load_sample_data() to download and load the normalized data, e.g.:
y_processed = load_sample_data("SKM-GN", "METHYL", normalized = TRUE)
# again, for the simplicity of this example, we subset the data to 5000 loci
y = y_processed[1:5000,-c(1:4)]
da_res = rrbs_differential_analysis(y, adj_pct_unaligned=TRUE)
head(da_res$timewise)
head(da_res$training)
} # }