Timewise and training differential analysis wrapper for RRBS data.

Usage

rrbs_differential_analysis(
  y,
  PHENO = MotrpacRatTraining6moData::PHENO,
  METHYL_META = MotrpacRatTraining6moData::METHYL_META,
  verbose = TRUE,
  adj_pct_unaligned = FALSE,
  samples_to_remove = data.table::data.table(MotrpacRatTraining6moData::OUTLIERS)[assay
    == "METHYL", viallabel],
  edger_tol = 1e-05,
  dataset_name = ""
)

Arguments

y

An edgeR::DGEList() object. y$genes is a metadata data frame with the locus coordinates (see details) and, at minimum, these fields: Chr, EntrezID, Symbol, and Strand.

PHENO

A data frame with a row per sample. Contains at least the following columns: "sex", "group". MotrpacRatTraining6moData::PHENO by default.

METHYL_META

A data frame with a row per sample. Contains the RRBS pipeline QA/QC scores. Contains at least the following columns: "pct_Unaligned". MotrpacRatTraining6moData::METHYL_META by default.

verbose

A logical. If TRUE, messages about pipeline progress are printed. Default is TRUE.

adj_pct_unaligned

A logical. If TRUE, adjust for the percentage of unaligned reads. Default is FALSE.

samples_to_remove

A character vector. Contains the IDs of the samples to remove (e.g., identified outliers or failed samples). By default, the METHYL samples in MotrpacRatTraining6moData::OUTLIERS.

edger_tol

A number. An internal parameter of edgeR. Default is 1e-05. Consider increasing if the algorithm takes too long.

dataset_name

A character. The name of the current dataset. Will be added to the output.

Value

A list. The first item is called timewise and contains the contrast-specific differential analysis results. The second item is called training and contains the overall training-level significance per locus.

Examples

if (FALSE) { # \dontrun{
data(PHENO)
data(METHYL_META)

# Raw data by tissue are available as RData files in Google Cloud Storage with the following URLs:   
# Brown adipose:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_BAT_RAW_DATA.rda
# Heart: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_HEART_RAW_DATA.rda
# Hippocampus: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_HIPPOC_RAW_DATA.rda
# Kidney: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_KIDNEY_RAW_DATA.rda
# Lung: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_LUNG_RAW_DATA.rda
# Liver: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_LIVER_RAW_DATA.rda
# Gastrocnemius: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_SKMGN_RAW_DATA.rda
# White adipose: 
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/METHYL_WATSC_RAW_DATA.rda

# load_methyl_raw_data() can be used to download data for a tissue, e.g.,
# download the gastrocnemius data and load the data object into this session:
yall = load_methyl_raw_data("SKM-GN")

# for the simplicity of this example, we subset the data to 5000 loci
y = yall[1:5000,]
dea_res = rrbs_differential_analysis(y, PHENO, METHYL_META, adj_pct_unaligned = TRUE)
head(dea_res$timewise)
head(dea_res$training)
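
# A minimal sketch of overriding samples_to_remove and dataset_name, using
# only arguments listed in the Usage section above. The extra vial labels
# and the dataset name below are hypothetical placeholders for illustration,
# not real sample IDs.
# The default outlier list is built with this expression (copied from Usage):
default_outliers = data.table::data.table(MotrpacRatTraining6moData::OUTLIERS)[assay == "METHYL", viallabel]
# Hypothetical additional samples to exclude:
extra_outliers = c("hypothetical_viallabel_1", "hypothetical_viallabel_2")
dea_custom = rrbs_differential_analysis(
  y,
  PHENO = PHENO,
  METHYL_META = METHYL_META,
  verbose = FALSE,
  adj_pct_unaligned = TRUE,
  samples_to_remove = unique(c(as.character(default_outliers), extra_outliers)),
  dataset_name = "SKM-GN subset (hypothetical label)"
)
# Inspect the top-level structure of the returned list
str(dea_custom, max.level = 1)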

# Alternatively, you can use the processed datasets.
# These are also available through the Google Cloud directory:
# https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/
# File names follow the patterns METHYL_XXX_NORM_DATA.rda and METHYL_XXX_DA.rda,
# where XXX is the tissue name:
# BAT (brown adipose), HEART, HIPPOC (hippocampus), KIDNEY, LUNG, LIVER, 
# SKMGN (gastrocnemius skeletal muscle), and WATSC (white adipose)

# You can use load_sample_data() to download and load the normalized data, e.g.:
y_processed = load_sample_data("SKM-GN", "METHYL", normalized = TRUE)
# again, for the simplicity of this example, we subset the data to 5000 loci
y = y_processed[1:5000,-c(1:4)]
da_res = rrbs_differential_analysis(y, adj_pct_unaligned=TRUE)
head(da_res$timewise)
head(da_res$training)
} # }