Normalized DNA methylation (METHYL) data used for visualization. Only for training-regulated features at 5% IHW FDR. For sample-level data for all features, see METHYL_RAW_DATA, METHYL_RAW_COUNTS, and METHYL_NORM_DATA.

Format

A data frame with CpG sites in rows (feature_ID) and samples in columns (viallabel)

Source

pass1b-06/analysis/epigenomics/epigen-rrbs/normalized-data/*normalized-log-M-window.txt

Details

Unfiltered METHYL sample-level data are only available via download from Google Cloud Storage. For example, https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/epigen-rda/METHYL_BAT_NORM_DATA.rda is the file for brown adipose tissue (BAT) data. To get data for another tissue, change the tissue abbreviation in the file name to one of HEART, HIPPOC, KIDNEY, LIVER, LUNG, SKMGN (gastrocnemius skeletal muscle), or WATSC (subcutaneous white adipose tissue). You can also use MotrpacRatTraining6mo::load_sample_data() or MotrpacRatTraining6mo::get_rdata_from_url() to download raw and normalized sample-level data for ATAC and METHYL. For more details about these files, see the readme of this repository at https://github.com/MoTrPAC/MotrpacRatTraining6mo/blob/main/README.md.
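
As a minimal sketch of the file-name substitution described above, the following base R code builds the download URL for a given tissue and loads the resulting .rda file. The helper names methyl_norm_url() and load_methyl_norm() are hypothetical and are not part of MotrpacRatTraining6mo; the sketch also assumes that every listed tissue follows the same file-name pattern as the BAT example.

```r
# Tissue abbreviations listed in the text above
methyl_tissues <- c("BAT", "HEART", "HIPPOC", "KIDNEY",
                    "LIVER", "LUNG", "SKMGN", "WATSC")

# Hypothetical helper: swap the tissue abbreviation into the BAT example URL
methyl_norm_url <- function(tissue) {
  stopifnot(length(tissue) == 1, tissue %in% methyl_tissues)
  sprintf(
    "https://storage.googleapis.com/motrpac-rat-training-6mo-extdata/epigen-rda/METHYL_%s_NORM_DATA.rda",
    tissue
  )
}

# Hypothetical helper: download the file once, then load its objects.
# Returns a named list of whatever objects the .rda file contains.
load_methyl_norm <- function(tissue, destdir = tempdir()) {
  url <- methyl_norm_url(tissue)
  dest <- file.path(destdir, basename(url))
  if (!file.exists(dest)) {
    download.file(url, dest, mode = "wb")
  }
  env <- new.env()
  obj_names <- load(dest, envir = env)
  mget(obj_names, envir = env)
}
```

Calling load_methyl_norm("LIVER") would then fetch the liver file into a temporary directory. The package functions named above are likely the more convenient route when the package is installed.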

Only CpG sites with methylation coverage of >=10 in all samples were included for downstream analysis, and normalization was performed separately in each tissue. Individual CpG sites were divided into 500 base-pair windows and were clustered using the Markov Clustering algorithm via the MCL R package (Jager, 2015). To apply MCL, an undirected graph was constructed for each 500 base-pair window, linking individual sites if their correlation was >=0.7. MCL was chosen for this task because it: (1) determines the number of clusters internally, (2) identifies homogeneous clusters, and (3) keeps single sites that are not correlated with other sites as singletons (clusters of size one). The resulting sites/clusters were used as input for normalization and differential analysis with edgeR (Robinson et al., 2010). To generate these normalized sample-level data, the methylation coverages of filtered sites/clusters were first log2-transformed, and normalization was performed using preprocessCore::normalize.quantiles.robust() (Bolstad, 2021).
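
The filtering and graph-construction steps above can be illustrated with a small base R sketch. This is not the pipeline code: the toy values are invented, the windows are assumed to be fixed 500 base-pair bins, and the correlation is assumed to be the Pearson correlation of per-site methylation levels across samples. The clustering itself (MCL) and the quantile normalization (preprocessCore) are not run here.

```r
# Toy inputs (invented for illustration): 5 CpG sites x 6 samples
pos <- c(s1 = 100, s2 = 220, s3 = 480, s4 = 610, s5 = 900)
coverage <- rbind(
  s1 = c(12, 15, 20, 25, 30, 35),
  s2 = c(14, 18, 22, 27, 33, 40),
  s3 = c(30, 12, 25, 11, 28, 13),
  s4 = c(10, 11, 12, 13, 14, 15),
  s5 = c( 9, 20, 21, 22, 23, 24)   # coverage < 10 in one sample
)
meth <- rbind(
  s1 = c(0.10, 0.20, 0.30, 0.40, 0.50, 0.60),
  s2 = c(0.15, 0.22, 0.35, 0.41, 0.55, 0.58),
  s3 = c(0.80, 0.20, 0.70, 0.30, 0.75, 0.25),
  s4 = c(0.50, 0.52, 0.49, 0.51, 0.50, 0.53),
  s5 = c(0.30, 0.31, 0.29, 0.33, 0.30, 0.32)
)

# Keep sites with coverage >= 10 in all samples
keep <- apply(coverage >= 10, 1, all)

# Assign the remaining sites to 500 bp bins (assumed windowing scheme)
window <- floor((pos[keep] - 1) / 500)

# Per window, link two sites if their correlation across samples is >= 0.7
adjacency <- lapply(split(names(pos)[keep], window), function(ids) {
  adj <- cor(t(meth[ids, , drop = FALSE])) >= 0.7
  diag(adj) <- FALSE   # no self-links
  adj
})

# log2-transform the coverages of the retained sites, as described above
log_cov <- log2(coverage[keep, , drop = FALSE])
```

In this toy example, s5 is dropped by the coverage filter, s1 and s2 are linked within the first window, and s3 has no links, so a clustering step would be expected to keep it as a singleton.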

After performing differential analysis, training-regulated features were selected at 5% IHW FDR. The normalized data were filtered down to these features and provided here.
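
The selection step can be pictured with the following toy sketch. The data values and the column name ihw_padj are hypothetical, and treating "5% IHW FDR" as an adjusted p-value strictly below 0.05 is an assumption; the layout of the real differential analysis results may differ.

```r
# Toy normalized data (invented values): features in rows, samples in columns
norm_data <- data.frame(
  feature_ID = c("f1", "f2", "f3"),
  vial_A = c(5.1, 6.3, 4.8),
  vial_B = c(5.4, 6.0, 4.9)
)

# Toy differential analysis results; `ihw_padj` is a hypothetical column name
da_results <- data.frame(
  feature_ID = c("f1", "f2", "f3"),
  ihw_padj = c(0.01, 0.20, 0.04)
)

# Select features below the 5% threshold, then subset the normalized data
selected <- da_results$feature_ID[da_results$ihw_padj < 0.05]
norm_05fdr <- norm_data[norm_data$feature_ID %in% selected, ]
```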

For the subset of normalized data corresponding to training-regulated features at 5% IHW FDR, see METHYL_NORM_DATA_05FDR.